(57) An enzymatically active DNA polymerase having between 540 and 582 amino acids having a tyrosine
at a position equivalent to position 667 of Taq DNA polymerase, wherein said polymerase lacks 5' to 3'
exonuclease activity, and wherein said polymerase has at least 95% homology in its amino acid sequence
to the DNA polymerase of Thermus aquaticus, Thermus fla-

Description

Background of the Invention 

The present invention relates to novel thermostable DNA polymerases, the genes and vectors encoding them, and
their use in DNA sequencing.

US Patents 4,889,818 and 5,079,352 describe the isolation and expression of a DNA polymerase known as Taq
DNA Polymerase (hereinafter referred to as Taq). It is reported that amino-terminal deletions wherein approximately
one-third of the coding sequence is absent have resulted in producing a gene product that is quite active in polymerase
assays. Taq is described as being of use in PCR (polymerase chain reaction).

US Patent No. 5,075,216 describes the use of Taq in DNA sequencing.

International patent application WO 92/06188 describes a DNA polymerase having an identical amino acid
sequence to Taq except that it lacks the N-terminal 235 amino acids of Taq, and its use in sequencing. This DNA
polymerase is known as ΔTaq.

US Patent 4,795,699 describes the use of T7 type DNA polymerases (T7) in DNA sequencing. These are of great
use in DNA sequencing in that they incorporate dideoxy nucleoside triphosphates (NTPs) with an efficiency comparable
to the incorporation of deoxy NTPs; other polymerases incorporate dideoxy NTPs far less efficiently, which requires
comparatively large quantities of these to be present in sequencing reactions.

At the DOE Contractor-Grantee Workshop (Nov. 13-17, 1994, Santa Fe) and the I. Robert Lehman Symposium
(Nov. 11-14, 1994, Sonoma), Prof. S. Tabor identified a site in DNA polymerases that can be modified to incorporate
dideoxy NTPs more efficiently. He reported that the presence or absence of a single hydroxy group (tyrosine vs.
phenylalanine) at a highly conserved position on E. coli DNA Polymerase I, T7, and Taq makes more than a 1000-fold
difference in their ability to discriminate against dideoxy NTPs. (See also European Patent Application 94203433.1,
published May 31, 1995, Publication No. 0 655 506 A1, hereby incorporated by reference herein.)

Summary of the Invention

The present invention provides a DNA polymerase having an amino acid sequence differentiated from Taq in that
it lacks the N-terminal 272 amino acids and has the phenylalanine at position 667 (of native Taq) replaced by tyrosine.
Preferably, the DNA polymerase has methionine at position 1 (equivalent to position 272 of Taq) (hereinafter referred
to as FY2). The full DNA sequence is given as Fig. 1 (SEQ. ID. NO. 1). Included within the scope of the present invention
are DNA polymerases having substantially identical amino acid sequences to the above which retain thermostability
and efficient incorporation of dideoxy NTPs.
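
The two numbering schemes used above (native Taq numbering and FY2 numbering) can be related by a fixed offset. The following is a minimal sketch, assuming only the stated equivalence of FY2 position 1 with native Taq position 272; the function names and the constant are illustrative and do not come from the specification.

```python
# Offset between FY2 numbering and native Taq numbering.
# Assumption: FY2 position 1 is equivalent to native Taq position 272.
FY2_TO_NATIVE_TAQ_OFFSET = 272 - 1


def fy2_to_native_taq(position: int) -> int:
    """Convert a residue position in FY2 to native Taq numbering."""
    if position < 1:
        raise ValueError("FY2 positions start at 1")
    return position + FY2_TO_NATIVE_TAQ_OFFSET


def native_taq_to_fy2(position: int) -> int:
    """Convert a native Taq residue position to FY2 numbering."""
    converted = position - FY2_TO_NATIVE_TAQ_OFFSET
    if converted < 1:
        raise ValueError("this native residue is absent from FY2")
    return converted


if __name__ == "__main__":
    # The Phe-to-Tyr site, given in native Taq numbering as position 667.
    print(native_taq_to_fy2(667))
```

Under this assumption the Phe to Tyr site at native position 667 falls at position 396 of FY2.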

By a substantially identical amino acid sequence is meant a sequence which contains 540 to 582 amino acids that
may have conservative amino acid changes compared with Taq which do not significantly influence thermostability or
nucleotide incorporation, i.e. other than the phenylalanine to tyrosine conversion. Such changes include substitution
of like charged amino acids for one another, or amino acids with small side chains for other small side chains, e.g. ala
for val. More drastic changes may be introduced at noncritical regions where little or no effect on polymerase activity
is observed by such a change.

The invention also features DNA polymerases that lack between 251 and 293 (preferably 271 or 272) of the
N-terminal amino acids of Thermus flavus (Tfl) and have the phenylalanine at position 666 (of native Tfl) replaced by
tyrosine; and those that lack between 253 and 295 (preferably 274) of the N-terminal amino acids of Thermus
thermophilus (Tth) and have the phenylalanine at position 669 (of native Tth) replaced by tyrosine.

By efficient incorporation of dideoxy NTPs is meant the ability of a polymerase to show little, if any, discrimination
in the incorporation of ddNTPs when compared with dNTPs. Suitably efficient discrimination is less than 1:10 and
preferably less than 1:5. Such discrimination can be measured by procedures known in the art.

One preferred substantially identical amino acid sequence to that given above is that which contains 562 amino
acids having methionine at position 1 and alanine at position 2 (corresponding to positions 271 and 272 of native Taq)
(hereinafter referred to as FY3). A full DNA sequence is given as Fig. 2. This is a preferred DNA polymerase for
expression by a gene of the present invention.

The purified DNA polymerases FY2 and FY3 both give a single polypeptide band on SDS polyacrylamide gels,
unlike ΔTaq, which, whether it has a phenylalanine or a tyrosine at position 667, forms several polypeptide bands of
similar size on SDS polyacrylamide gels.

A second preferred substantially identical amino acid sequence is that which lacks 274 of the N-terminal amino
acids of Thermus thermophilus having methionine at position 1 and the phenylalanine to tyrosine mutation at position
396 (corresponding to position 669 of native Tth) (hereinafter referred to as FY4). A full DNA sequence is given as Fig.
5 (SEQ. ID. NO. 14).

The present invention also provides a gene encoding a DNA polymerase of the present invention. In order to assist
in the expression of the DNA polymerase activity, the modified gene preferably codes for a methionine residue at
position 1 of the new DNA polymerase. In addition, in one preferred embodiment of the invention, the modified gene
also codes for an alanine at position 2 (corresponding to position 272 of native Taq).

In a further aspect, the present invention provides a vector containing the gene encoding the DNA polymerase
activity of the present invention, e.g. encoding an amino acid sequence differentiated from native Taq in that it lacks
the N-terminal 272 amino acids and has phenylalanine at position 396 (equivalent to position 667 of Taq) replaced by
tyrosine or a substantially identical amino acid sequence thereto.

In a yet further aspect, the present invention provides a host cell comprising a vector containing the gene encoding
the DNA polymerase activity of the present invention, e.g. encoding an amino acid sequence differentiated from native
Taq in that it lacks the N-terminal 272 amino acids and has phenylalanine at position 396 (equivalent to position 667
of native Taq) replaced by tyrosine or a substantially identical amino acid sequence thereto.

The DNA polymerases of the present invention are preferably in a purified form. By purified form is meant that the
DNA polymerase is isolated from a majority of host cell proteins normally associated with it; preferably the polymerase
is at least 10% (w/w) of the protein of a preparation, even more preferably it is provided as a homogeneous preparation,
e.g. a homogeneous solution. Preferably the DNA polymerase is a single polypeptide on an SDS polyacrylamide gel.

The DNA polymerases of the present invention are suitably used in sequencing, preferably in combination with a
pyrophosphatase. Accordingly, the present invention provides a composition which comprises a DNA polymerase of
the present invention in combination with a pyrophosphatase, preferably a thermostable pyrophosphatase such as
Thermoplasma acidophilum pyrophosphatase (Schafer, G. and Richter, O.H. (1992) Eur. J. Biochem. 209, 351-355).

The DNA polymerases of the present invention can be constructed using standard techniques. By way of example,
mutagenic PCR primers can be designed to incorporate the desired Phe to Tyr amino acid change (FY mutation) in
one primer. In our hands these primers also carried restriction sites that are found internally in the sequence of the Taq
polymerase gene clone of Delta Taq, pWB253, which was used by us as template DNA. However, the same PCR
product can be generated with this primer pair from any clone of Taq or with genomic DNA isolated directly from
Thermus aquaticus. The PCR product encoding only part of the gene is then digested with the appropriate restriction
enzymes and used as a replacement sequence for the clone of Delta Taq digested with the same restriction enzymes.
In our hands the resulting plasmid was designated pWB253Y. The presence of the mutation can be verified by DNA
sequencing of the amplified region of the gene.

Further primers can be prepared that encode for a methionine residue at the N-terminus that is not found at the
corresponding position of Taq, the sequence continuing with amino acid residue 273. These primers can be used with
a suitable plasmid, e.g. pWB253Y DNA, as a template for amplification and the amplified gene inserted into a vector,
e.g. pRE2, to create a gene, e.g. pRE273Y, encoding the polymerase (FY2). The entire gene can be verified by DNA
sequencing.

Improved expression of the DNA polymerases of the present invention in the pRE2 expression vector was obtained
by creating further genes, pREFY2pref (encoding a protein identical to FY2) and pREFY3 encoding FY3. A mutagenic
PCR primer was used to introduce silent codon changes (i.e., the amino acid encoded is not changed) at the amino
terminus of the protein which did not affect the sequence of the polypeptide. These changes led to increased production
of FY2 polymerase. FY3 was designed to promote increased translation efficiency in vivo. In addition to the silent codon
changes introduced in pREFY2pref, a GCT codon was added in the second position (SEQ. ID. NO. 2), as occurs
frequently in strongly expressed genes in E. coli. This adds an amino acid to the sequence of FY2, and hence the
protein was given its own designation FY3. Both constructs produce more enzyme than pRE273Y.

Silent codon changes such as the following increase protein production in E. coli:
substitution of the codon GAG for GAA;
substitution of the codon AGG, AGA, CGG or CGA for CGT or CGC;
substitution of the codon CTT, CTC, CTA, TTG or TTA for CTG;
substitution of the codon ATA for ATT or ATC;
substitution of the codon GGG or GGA for GGT or GGC.
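
A codon change is silent only if both codons encode the same amino acid, which can be checked against the standard genetic code. The sketch below is illustrative and not part of the specification. It assumes that the leucine entry reads CTT, CTC, CTA, TTG or TTA paired with CTG and that the isoleucine entry reads ATA paired with ATT or ATC, since those are the readings that keep the encoded amino acid unchanged. The names `CODON_TABLE`, `SILENT_CHANGES` and `is_silent` are hypothetical.

```python
from itertools import product

# Standard genetic code in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Each entry pairs the two codon groups named in one line of the list above.
# The leucine and isoleucine entries use the readings assumed in the lead-in.
SILENT_CHANGES = [
    (["GAG"], ["GAA"]),
    (["AGG", "AGA", "CGG", "CGA"], ["CGT", "CGC"]),
    (["CTT", "CTC", "CTA", "TTG", "TTA"], ["CTG"]),
    (["ATA"], ["ATT", "ATC"]),
    (["GGG", "GGA"], ["GGT", "GGC"]),
]


def is_silent(group_a, group_b):
    """Return True if every codon in both groups encodes the same amino acid."""
    residues = {CODON_TABLE[codon] for codon in group_a + group_b}
    return len(residues) == 1


if __name__ == "__main__":
    for group_a, group_b in SILENT_CHANGES:
        residue = CODON_TABLE[group_b[0]]
        print(residue, "/".join(group_a), "<->", "/".join(group_b),
              is_silent(group_a, group_b))
```

Because the check is symmetric, it does not depend on which codon group is taken as the starting point of the substitution.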

The present invention also provides a method for determining the nucleotide base sequence of a DNA molecule.
The method includes providing a DNA molecule annealed with a primer molecule able to hybridize to the DNA molecule;
and incubating the annealed molecules in a vessel containing at least one deoxynucleotide triphosphate, and a DNA
polymerase of the present invention. Also provided is at least one DNA synthesis terminating agent which terminates
DNA synthesis at a specific nucleotide base. The method further includes separating the DNA products of the incubating
reaction according to size, whereby at least a part of the nucleotide base sequence of the DNA molecule can be
determined.

In preferred embodiments, the sequencing is performed at a temperature above 50°C, 60°C, or 70°C.

In other preferred embodiments, the DNA polymerase has less than 1000, 250, 100, 50, 10 or even 2 units of
exonuclease activity per mg of polymerase (measured by standard procedure, see below) and is able to utilize primers
having only 4, 6 or 10 bases; and the concentration of all four deoxynucleoside triphosphates at the start of the
incubating step is sufficient to allow DNA synthesis to continue until terminated by the agent, e.g. a ddNTP.

For cycle sequencing, the DNA polymerases of the present invention make it possible to use significantly lower
amounts of dideoxynucleotides compared to naturally occurring enzymes. That is, the method involves providing an
excess amount of deoxynucleotides to all four dideoxynucleotides in a cycle sequencing reaction, and performing the
cycle sequencing reaction.

Preferably, more than 2, 5, 10 or even 100 fold excess of a dNTP is provided to the corresponding ddNTP.

In a related aspect, the invention features a kit or solution for DNA sequencing including a DNA polymerase of the
present invention and a reagent necessary for the sequencing such as dITP, deaza GTP, a chain terminating agent
such as a ddNTP, and a manganese-containing solution or powder, and optionally a pyrophosphatase.

In another aspect, the invention features a method for providing a DNA polymerase of the present invention by
providing a nucleic acid sequence encoding the modified DNA polymerase, expressing the nucleic acid within a host
cell, and purifying the DNA polymerase from the host cell.

In another related aspect, the invention features a method for sequencing a strand of DNA essentially as described
above with one or more (preferably 2, 3 or 4) deoxyribonucleoside triphosphates, a DNA polymerase of the present
invention, and a first chain terminating agent. The DNA polymerase causes the primer to be elongated to form a first
series of first DNA products differing in the length of the elongated primer, each first DNA product having a chain
terminating agent at its elongated end, and the number of molecules of each first DNA product being approximately
the same for substantially all DNA products differing in length by no more than 20 bases. The method also features
providing a second chain terminating agent in the hybridized mixture at a concentration different from the first chain
terminating agent, wherein the DNA polymerase causes production of a second series of second DNA products differing
in the length of the elongated primer, with each second DNA product having the second chain terminating agent at its
elongated end. The number of molecules of each second DNA product is approximately the same for substantially all
second DNA products differing in length from each other by from 1 to 20 bases, and is distinctly different from the
number of molecules of all the first DNA products having a length differing by no more than 20 bases from that of said
second DNA products.

In preferred embodiments, three or four such chain terminating agents can be used to make different products
and the sequence reaction is provided with a magnesium ion, or even a manganese or iron ion (e.g. at a concentration
between 0.05 and 100 mM, preferably between 1 and 10 mM); and the DNA products are separated according to
molecular weight in four or less lanes of a gel.

In another related aspect, the invention features a method for sequencing a nucleic acid by combining an
oligonucleotide primer, a nucleic acid to be sequenced, between one and four deoxyribonucleoside triphosphates, a DNA
polymerase of the present invention, and at least two chain terminating agents in different amounts, under conditions
favoring extension of the oligonucleotide primer to form nucleic acid fragments complementary to the nucleic acid to
be sequenced. For example, the chain terminating agent may be a dideoxynucleotide terminator for adenine, guanine,
cytosine or thymine. The method further includes separating the nucleic acid fragments by size and determining the
nucleic acid sequence. The agents are differentiated from each other by intensity of a label in the primer extension
products.

While it is common to use gel electrophoresis to separate DNA products of a DNA sequencing reaction, those in
the art will recognize that other methods may also be used. Thus, it is possible to detect each of the different fragments
using procedures such as time of flight mass spectrometry, electron microscopy, and single molecule detection
methods.

The invention also features an automated DNA sequencing apparatus having a reactor including reagents which
provide at least two series of DNA products formed from a single primer and a DNA strand. Each DNA product of a
series differs in molecular weight and has a chain terminating agent at one end. The reagents include a DNA polymerase
of the present invention. The apparatus includes a separating means for separating the DNA product along one axis
of the separator to form a series of bands. It also includes a band reading means for determining the position and
intensity of each band after separation along the axis, and a computing means that determines the DNA sequence of
the DNA strand solely from the position and intensity of the bands along the axis and not from the wavelength of
emission of light from any label that may be present in the separating means.
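
The computing step described above, in which a sequence is read from band position and intensity alone, can be illustrated with a toy base caller. This is a sketch under stated assumptions and not the apparatus of the invention: the reference intensities are hypothetical values standing in for four terminators supplied in different amounts, band position is treated as a rank that increases with fragment length, and each band is assigned to the terminator whose reference intensity is nearest.

```python
# Hypothetical relative band intensities, one level per chain terminating agent.
# Real values would depend on the terminator concentrations actually used.
REFERENCE_INTENSITY = {"A": 1.0, "C": 2.0, "G": 4.0, "T": 8.0}


def call_bases(bands):
    """Call a sequence from (position, intensity) pairs read along one axis.

    Bands are ordered by position, and each band is assigned the base whose
    reference intensity lies closest to the measured intensity.
    """
    sequence = []
    for _position, intensity in sorted(bands):
        base = min(
            REFERENCE_INTENSITY,
            key=lambda b: abs(REFERENCE_INTENSITY[b] - intensity),
        )
        sequence.append(base)
    return "".join(sequence)


if __name__ == "__main__":
    measured = [(3, 7.6), (1, 1.1), (2, 3.7), (4, 2.2)]
    print(call_bases(measured))
```

A scheme of this kind only works when band intensities within one series are uniform, which is why uniform incorporation of terminators matters for single-lane sequencing.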

Other features and advantages of the invention will be apparent from the following description of the preferred
embodiments thereof, and from the claims.

Description of the Preferred Embodiments

The drawings will first briefly be described. 
Drawings 

Figs. 1-4 are the DNA sequences, and corresponding amino acid sequences, of FY2, FY3, and the DNA polymerases
of T. flavus and Thermus thermophilus, respectively. Figure 5 is the DNA sequence and corresponding amino
acid sequence of FY4.

Examples

The following examples serve to illustrate the DNA polymerases of the present invention and their use in
sequencing.

Preparation of FY DNA Polymerases (FY2 and FY3) 

Bacterial Strains

E. coli strains: MV1190 [Δ(srl-recA)306::Tn10, Δ(lac-proAB), thi, supE, F' (traD36 proAB+ lacIq lacZΔM15)];
DH1λ+ [gyrA96, recA1, relA1, endA1, thi-1, hsdR17, supE44, λ+]; M5248 [λ(bio275, cI857, cIII+, N+, Δ(H1))].

PCR

Reaction conditions based on the procedure of Barnes (91 Proc. Nat'l. Acad. Sci. 2216-2220, 1994) were as follows:
20 mM Tricine pH 8.8, 85 mM KOAc, 200 µM dNTPs, 10% glycerol, 5% DMSO, 0.5 µM each primer, 1.5 mM MgOAc,
2.5 U HotTub (Amersham Life Science Inc.), 0.025 U DeepVent (New England Biolabs), 1-100 ng target DNA per 100 µl
reaction. Cycling conditions were 94°C 30s, 68°C 10m40s for 8 cycles; then 94°C 30s, 68°C 12m00s for 8 cycles; then
94°C 30s, 68°C 13m20s for 8 cycles; then 94°C 30s, 68°C 14m40s for 8 cycles.
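
The cycling programme above lengthens the 68°C step by 80 seconds in each successive block of 8 cycles. As a rough planning aid, the total programmed hold time can be added up as in the sketch below. It counts only the stated hold times and assumes nothing about ramp times between temperatures, which depend on the instrument; the variable names are illustrative.

```python
# (number of cycles, seconds at 94 C, seconds at 68 C) for each block.
CYCLING_BLOCKS = [
    (8, 30, 10 * 60 + 40),
    (8, 30, 12 * 60 + 0),
    (8, 30, 13 * 60 + 20),
    (8, 30, 14 * 60 + 40),
]


def total_hold_seconds(blocks):
    """Sum the programmed hold times over all blocks (ramp times excluded)."""
    return sum(cycles * (denature + extend) for cycles, denature, extend in blocks)


if __name__ == "__main__":
    seconds = total_hold_seconds(CYCLING_BLOCKS)
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    print(f"{seconds} s = {hours} h {minutes} min {secs} s")
```

With these figures the 32 cycles add up to a little over seven hours of hold time.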

In vitro mutagenesis 

Restriction enzyme digestions, plasmid preparations, and other in vitro manipulations of DNA were performed
using standard protocols (Sambrook et al., Molecular Cloning 2nd Ed. Cold Spring Harbor Press, 1989). PCR (see
protocol above) was used to introduce a Phe to Tyr amino acid change at codon 667 of native Taq DNA polymerase
(which is codon 396 of FY2). Oligonucleotide primer 1 dGCTTGGGCAGAGGATCCGCCGGG (SEQ. ID. NO. 3) spans
nucleotides 954 to 976 of the coding region of SEQ. ID. NO. 1 including a BamHI restriction site. Mutagenic oligo primer
2 dGGGATGGCTAGCTCCTGGGAGAGGCGGTGGGCCGACATGCCGTAGAGGACCCCGTAGTTGATGG (SEQ. ID.
NO. 4) spans nucleotides 1178 to 1241 including an NheI site and codon 396 of Sequence ID. NO. 1. A clone of exo-
Taq deleted for the first 235 amino acids, pWB253 encoding DeltaTaq polymerase (Barnes, 112 Gene 29-35, 1992),
was used as template DNA. Any clone of Taq polymerase or genomic DNA from Thermus aquaticus could also be
utilized to amplify the identical PCR product. The PCR product was digested with BamHI and NheI, and this fragment
was ligated to BamHI/NheI digested pWB253 plasmid to replace the corresponding fragment to create pWB253Y,
encoding polymerase FY1. Cells of E. coli strain MV1190 were used for transformation and induction of protein
expression, although any host strain carrying a lac repressor could be substituted. DNA sequencing verified the Phe to
Tyr change in the coding region.
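
The stated coordinates of primers 1 and 2 can be checked against their sequences. The sketch below is illustrative only. It assumes that primer 2 is an antisense primer whose reverse complement covers sense-strand nucleotides 1178 to 1241, that codon n starts at nucleotide 3(n-1)+1 of the coding region, and that the BamHI and NheI recognition sequences are GGATCC and GCTAGC. The helper names are hypothetical.

```python
PRIMER_1 = "GCTTGGGCAGAGGATCCGCCGGG"
PRIMER_2 = "GGGATGGCTAGCTCCTGGGAGAGGCGGTGGGCCGACATGCCGTAGAGGACCCCGTAGTTGATGG"

BAMHI_SITE = "GGATCC"
NHEI_SITE = "GCTAGC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def codon_from_antisense_primer(primer, first_sense_nt, codon_number):
    """Return the sense-strand codon covered by an antisense primer.

    first_sense_nt is the 1-based coding-region coordinate of the first
    nucleotide of the primer's reverse complement.
    """
    sense = reverse_complement(primer)
    codon_start = 3 * (codon_number - 1) + 1
    offset = codon_start - first_sense_nt
    if offset < 0 or offset + 3 > len(sense):
        raise ValueError("codon is not covered by this primer")
    return sense[offset:offset + 3]


if __name__ == "__main__":
    print(len(PRIMER_1), BAMHI_SITE in PRIMER_1)
    print(len(PRIMER_2), NHEI_SITE in PRIMER_2)
    print(codon_from_antisense_primer(PRIMER_2, 1178, 396))
```

Under these assumptions the primer lengths match the stated spans (23 and 64 nucleotides) and codon 396 reads TAC, a tyrosine codon.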

PCR primer 3 dGGAATTCCATATGGACGATCTGAAGCTCTCC (SEQ. ID. NO. 5), spanning the start codon and
containing restriction enzyme sites, was used with PCR primer 4 dGGGGTACCAAGCTTCACTCCTTGGCGGAGAG
(SEQ. ID. NO. 6) containing restriction sites and spanning the stop codon (codon 562 of Sequence ID. NO. 1). A
methionine start codon and restriction enzyme recognition sequences were added to PCR primer 5
dGGAATTCCATATGCTGGAGAGGCTTGAGTTT (SEQ. ID. NO. 7), which was used with primer 4 above. PCR was performed using
the above primer pairs, and plasmid pWB253Y as template. The PCR products were digested with restriction enzymes
NdeI and KpnI and ligated to NdeI/KpnI digested vector pRE2 (Reddi et al., 17 Nucleic Acids Research 10,473-10,488,
1989) to make plasmids pRE236Y, encoding FY1 polymerase, and pRE273Y encoding FY2 polymerase, respectively.
Cells of E. coli strain DH1λ+ were used for primary transformation with this and all subsequent pRE2 constructions, and
strain M5248 (λcI857) was used for protein expression, although any comparable pair of E. coli strains carrying the
cI+ and cI857 alleles could be utilized. Alternatively, any rec+ cI+ strain could be induced by chemical agents such as
nalidixic acid to produce the polymerase. The sequences of both genes were verified. pRE273Y was found to produce
a single polypeptide band on SDS polyacrylamide gels, unlike pRE253Y or pRE236Y.

Primer 6 dGGAATTCCATATGCTGGAACGTCTGGAGTTTGGCAGCCTCCTC (SEQ. ID. NO. 8) and primer 4 were
used to make a PCR product introducing silent changes in codon usage of FY2. The product was digested with NdeI/
BamHI and ligated to a pRE2 construct containing the 3' end of FY2 to create pREFY2pref, encoding FY2 DNA
polymerase. Primer 7 dGGAATTCCATATGGCTCTGGAACGTCTGGAGTTTGGCAGCCTCCTC (SEQ. ID. NO. 9) and primer
4 were used to make a PCR product introducing an additional alanine codon commonly occurring at the second position
of highly expressed genes. The NdeI/BamHI digested fragment was used as above to create pREFY3, encoding FY3
DNA polymerase.
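
Translating primers 5, 6 and 7 from their ATG start codons shows which changes are silent and where the extra alanine codon sits. The following is a minimal sketch, assuming the standard genetic code and assuming that translation begins at the first ATG in each primer (the ATG within the NdeI site CATATG). The function names are illustrative and not part of the specification.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

PRIMER_5 = "GGAATTCCATATGCTGGAGAGGCTTGAGTTT"
PRIMER_6 = "GGAATTCCATATGCTGGAACGTCTGGAGTTTGGCAGCCTCCTC"
PRIMER_7 = "GGAATTCCATATGGCTCTGGAACGTCTGGAGTTTGGCAGCCTCCTC"


def codons_from_start(primer):
    """Split a primer into complete codons starting at its first ATG."""
    coding = primer[primer.index("ATG"):]
    usable = len(coding) - len(coding) % 3
    return [coding[i:i + 3] for i in range(0, usable, 3)]


def translate_from_start(primer):
    """Translate a primer from its first ATG using the standard genetic code."""
    return "".join(CODON_TABLE[codon] for codon in codons_from_start(primer))


def changed_codons(primer_a, primer_b):
    """List (index, codon_a, codon_b) for codons that differ in the shared region."""
    pairs = zip(codons_from_start(primer_a), codons_from_start(primer_b))
    return [(i, a, b) for i, (a, b) in enumerate(pairs, start=1) if a != b]


if __name__ == "__main__":
    for name, primer in (("5", PRIMER_5), ("6", PRIMER_6), ("7", PRIMER_7)):
        print("primer", name, translate_from_start(primer))
    print(changed_codons(PRIMER_5, PRIMER_6))
```

Read this way, primers 5 and 6 encode the same N-terminal residues while differing at three codons, and primer 7 encodes the same residues with an alanine inserted after the initial methionine.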

Preparation of FY4 DNA Polymerase

Bacterial Strains

E. coli strains: DH1λ+ [gyrA96, recA1, relA1, endA1, thi-1, hsdR17, supE44, λ+]; M5248 [λ(bio275, cI857, cIII+,
N+, Δ(H1))].

PCR

Genomic DNA was prepared by standard techniques from Thermus thermophilus. The DNA polymerase gene of
Thermus thermophilus is known to reside on a 3 kilobase AlwNI fragment. To enrich for polymerase sequences in some
PCR reactions, the genomic DNA was digested prior to PCR with AlwNI, and fragments of approximately 3 kb were
selected by agarose gel electrophoresis to be used as template DNA. Reaction conditions were as follows: 10 mM Tris
pH 8.3, 50 mM KCl, SOO µM dNTPs, 0.001% gelatin, 1.0 µM each primer, 1.5 mM MgCl2, 2.5 U Tth, 0.025 U DeepVent
(New England Biolabs), per 100 µl reaction. Cycling conditions were 94°C 2 min, then 35 cycles of 94°C 30s, 55°C
30s, 72°C 3 min, followed by 72°C for 7 min.

In vitro mutagenesis

Restriction enzyme digestions, plasmid preparations, and other in vitro manipulations of DNA were performed
using standard protocols (Sambrook et al., 1989). Plasmid pMR1 was constructed to encode an exonuclease-free
polymerase, with optimized codons for expression in E. coli at the 5' end. Primer 8 (SEQ. ID. NO. 10)
(GGAATTCCATATGCTGGAAGGTCTGGAATTCGGCAGGCTC) was used with Primer 9 (SEQ. ID. NO. 11)
(GGGGTACGGTAACCCTTGGCGGAAAGCGAGTG) to create a PCR product from Tth genomic DNA, which was digested with
restriction enzymes NdeI and KpnI and inserted into plasmid pRE2 (Reddi et al., 1989, Nucleic Acids Research 17,
10473-10488) digested with the same enzymes.

To create the desired F396Y mutation, two PCR products were made from Tth chromosomal DNA. Primer 8 above
was used in combination with Primer 10 (SEQ. ID. NO. 12)
(GGGATGGCTAGCTGCTGGGAGAGGCTATGGGCGGACATGCCGTAGAGGACGCCGTAGTTGACCG) to create a portion of the gene
containing the F to Y amino acid change as well as a silent change to create an NheI restriction site. Primer 11
(SEQ. ID. NO. 13) (CTAGCTAGCCATCGGCTAGGAAGAAGCGGTGGCCT) was used in combination with primer 9 above to
create a portion of the gene from the introduced NheI site to the stop codon at the 3' end of the coding sequence.
The PCR product of Primers 8 and 10 was digested with NdeI and NheI, and the PCR product of Primers 9 and 11 was
digested with NheI and KpnI. These were introduced into expression vector pRE2 which was digested with NdeI and
KpnI to produce plasmid pMR5. In addition to the desired changes, pMR5 was found to have a spurious change
introduced by PCR, which led to an amino acid substitution, K234R. Plasmid pMR8 was created to eliminate this
substitution, by replacing the AflII/BamHI fragment of pMR5 with the corresponding fragment from pMR1. The FY4
polymerase encoded by plasmid pMR8 (SEQ. ID. NO. 14) is given in Figure 5.

Cells of E. coli strain DH1λ+ were used for primary transformation, and strain M5248 (λcI857) was used for protein
expression, although any comparable pair of E. coli strains carrying the cI+ and cI857 alleles could be utilized.
Alternatively, any rec+ cI+ strain could be induced by chemical agents such as nalidixic acid to produce the polymerase.

Protein Sequencing

Determinations of amino terminal protein sequences were performed at the W. M. Keck Foundation, Biotechnology
Resource Laboratory, New Haven, Connecticut.

Purification of Polymerases

A 1 liter culture of 2X LB (2% Bacto-Tryptone, 1% Bacto-Yeast Extract, 0.5% NaCl) + 0.2% Casamino Acids + 20
mM KPO4 pH 7.5 + 50 µg/ml Ampicillin was inoculated with a glycerol stock of the appropriate cell strain and grown
at 30°C with agitation until cells were in log phase (0.7-1.0 OD590). 9 liters of 2X LB + 0.2% Casamino Acids + 20 mM
KPO4 pH 7.5 + 0.05% Mazu Anti-foam was inoculated with 1 liter of log phase cells in 10 liter Microferm Fermentors
(New Brunswick Scientific Co.). Cells were grown at 30°C under 15 psi pressure, 350-450 rpm agitation, and an air
flow rate of 14,000 cc/min ±1000 cc/min. When the OD590 reached 1.5-2.0, the cultures were induced by increasing
the temperature to 40-42°C for 90-120 minutes. The cultures were then cooled to < 20°C and the cells harvested by
centrifugation in a Sorvall RC-3B centrifuge at 5000 rpm at 4°C for 15-20 minutes. Harvested cells were stored at -80°C.

Frozen cells were broken into small pieces and resuspended in pre-warmed (90-95°C) Lysis Buffer (20 mM Tris
pH 8.5, 1 mM EDTA, 10 mM MgCl2, 16 mM (NH4)2SO4, 0.1% Tween 20, 0.1% Nonidet P-40, 1 mM PMSF).
Resuspended cells were then heated rapidly to 80°C and incubated at 80°C for 20 minutes with constant stirring. The
suspension was then rapidly cooled on ice. The cell debris was removed by centrifugation using a Sorvall GSA rotor at
10,000 rpm for 20 minutes at 4°C. The NaCl concentration of the supernatant was adjusted to 300 mM. The sample
was then passed through a diethylaminoethyl cellulose (Whatman DE-52) column that had been previously equilibrated
with Buffer A (20 mM Tris pH 8.5, 1 mM EDTA, 0.1% Tween 20, 0.1% Nonidet P-40, 300 mM NaCl, 10% glycerol, 1
mM DTT), and polymerase collected in the flow through. The sample was then diluted to a concentration of NaCl of
100 mM and applied to a Heparin-sepharose column. The polymerase was eluted from the column with a NaCl gradient
(100-500 mM NaCl). The sample was then dialyzed against Buffer B (20 mM Tris pH 8.5, 1 mM EDTA, 0.1% Tween
20, 0.1% Nonidet P-40, 10 mM KCl, 10% glycerol, 1 mM DTT) and further diluted as needed to lower the conductivity
of the sample to the conductivity of Buffer B. The sample was then applied to a diethylaminoethyl (Waters DEAE 15
HR) column and eluted with a 10-500 mM KCl gradient. The polymerase was then diluted with an equal volume of
Final Buffer (20 mM Tris pH 8.5, 0.1 mM EDTA, 0.5% Tween 20, 0.5% Nonidet P-40, 100 mM KCl, 50% glycerol, 1
mM DTT) and dialyzed against Final Buffer.

Assay of Exonuclease Activity 

The exonuclease assay was performed by incubating 5 µl (25-150 units) of DNA polymerase with 5 µg of labelled
[3H]-pBR322 PCR fragment (^.6x^0^ cpm/µg DNA) in 100 µl of reaction buffer of 20 mM Tris-HCl pH 8.5, 5 mM MgCl2,
10 mM KCl, for 1 hour at 50°C. After this time interval, 200 µl of 1:1 ratio of 50 µg/ml salmon-sperm DNA with 2 mM
EDTA and 20% TCA with 2% sodium pyrophosphate were added into the assay aliquots. The aliquots were put on ice
for 10 min and then centrifuged at 12,000 g for 10 min. Acid-soluble radioactivity in 200 µl of the supernatant was
quantitated by liquid scintillation counting. One unit of exonuclease activity was defined as the amount of enzyme that
catalyzed the acid solubilization of 10 nmol of total nucleotide in 30 min at 60°C.

Utility in DNA Sequencing 

Example 1: DNA Sequencing with FY Polymerases (e.g., FY2 and FY3)

The following components were added to a microcentrifuge vial (0.5 ml): 0.4 pmol M13 DNA (e.g., M13mp18, 1.0
µg); 2 µl Reaction Buffer (260 mM Tris-HCl, pH 9.5, 65 mM MgCl2); 2 µl of labeling nucleotide mixture (1.5 µM each
of dGTP, dCTP and dTTP); 0.5 µl (5 µCi) of [α-33P]dATP (about 2000 Ci/mmol); 1 µl -40 primer (0.5 µM; 0.5 pmol/µl
5'-GTTTTCCCAGTCACGAC-3'); 2 µl of a mixture containing 4 U/µl FY polymerase and 6.6 U/ml Thermoplasma
acidophilum inorganic pyrophosphatase (32 U/µl polymerase and 53 U/ml pyrophosphatase in 20 mM Tris (pH 8.5), 100
mM KCl, 0.1 mM EDTA, 1 mM DTT, 0.5% NP-40, 0.5% TWEEN-20 and 50% glycerol, diluted 8 fold in dilution buffer
(10 mM Tris-HCl pH 8.0, 1 mM 2-mercaptoethanol, 0.5% TWEEN-20, 0.5% NP-40)); and water to a total volume of
17.5 µl. These components (the labeling reaction) were mixed and the vial was placed in a constant-temperature water
bath at 45°C for 5 minutes.
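
The working enzyme concentrations quoted above follow from the 8 fold dilution of the stock mixture. A short arithmetic sketch, using only the figures given in the text (the variable names are illustrative):

```python
def dilute(stock_concentration, fold):
    """Concentration after diluting a stock by the given fold factor."""
    return stock_concentration / fold


# Stock mixture and dilution factor as stated in the text.
polymerase_u_per_ul = dilute(32.0, 8)        # FY polymerase, U/µl
pyrophosphatase_u_per_ml = dilute(53.0, 8)   # pyrophosphatase, U/ml

# 2 µl of the diluted mixture are added to each labeling reaction.
polymerase_units_per_reaction = polymerase_u_per_ul * 2.0

if __name__ == "__main__":
    print(polymerase_u_per_ul, "U/µl polymerase")
    print(round(pyrophosphatase_u_per_ml, 1), "U/ml pyrophosphatase")
    print(polymerase_units_per_reaction, "U polymerase per labeling reaction")
```

The diluted values agree with the stated 4 U/µl and (after rounding) 6.6 U/ml.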

Four vials were labeled A, C, G, and T and filled with 4 µl of the corresponding termination mix: ddA termination
mix (150 µM each dATP, dCTP, dGTP, dTTP, 1.5 µM ddATP); ddT termination mix (150 µM each dATP, dCTP, dGTP,
dTTP, 1.5 µM ddTTP); ddC termination mix (150 µM each dATP, dCTP, dGTP, dTTP, 1.5 µM ddCTP); ddG termination
mix (150 µM each dATP, dCTP, dGTP, dTTP, 1.5 µM ddGTP).

The labeling reaction was divided equally among the four termination vials (4 µl to each termination reaction vial),
and tightly capped.
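
The nucleotide concentrations in each termination reaction can be estimated from the volumes above. The sketch below is an approximation: it takes 150 µM for each dNTP and 1.5 µM for the ddNTP in every termination mix, and it neglects the small amount of dNTP carried over from the labeling reaction. The function name is illustrative.

```python
def concentration_after_mixing(stock_um, stock_ul, added_ul):
    """Concentration (µM) of a stock component after adding another volume."""
    return stock_um * stock_ul / (stock_ul + added_ul)


TERMINATION_MIX_UL = 4.0
LABELING_ALIQUOT_UL = 4.0

dntp_um = concentration_after_mixing(150.0, TERMINATION_MIX_UL, LABELING_ALIQUOT_UL)
ddntp_um = concentration_after_mixing(1.5, TERMINATION_MIX_UL, LABELING_ALIQUOT_UL)
dntp_to_ddntp_ratio = dntp_um / ddntp_um

# Volume of the 17.5 µl labeling reaction left after four 4 µl aliquots.
labeling_reaction_left_ul = 17.5 - 4 * LABELING_ALIQUOT_UL

if __name__ == "__main__":
    print(dntp_um, "µM each dNTP")
    print(ddntp_um, "µM ddNTP")
    print(dntp_to_ddntp_ratio, "fold excess of dNTP over ddNTP")
    print(labeling_reaction_left_ul, "µl of labeling reaction unused")
```

The resulting 100 fold excess of each dNTP over the corresponding ddNTP is set by the termination mix itself and is unchanged by the dilution.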

The four vials were placed in a constant-temperature water bath at 72°C for 5 minutes. Then 4 µl of Stop Solution
(95% Formamide, 20 mM EDTA, 0.05% Bromophenol Blue, 0.05% Xylene Cyanol FF) was added to each vial, and the
vials were heated briefly to 70°-80°C immediately prior to loading on a sequencing gel (8% acrylamide, 8.3 M urea).
Autoradiograms required an 18-36 hour exposure using Kodak XAR-5 film or Amersham Hyperfilm MP. High-quality
sequence results with uniform band intensities were obtained. The band intensities were much more uniform than those
obtained with similar protocols using Taq DNA polymerase or ΔTaq DNA polymerase.

Example 2: DNA Cycle Sequencing with FY Polymerases 

55 ' 

The following components were added to a microcentrifuge vial (0.5 ml) which is suitable for insertion into
a thermocycler machine (e.g., Perkin-Elmer DNA Thermal Cycler): 0.05 pmol or more M13 DNA (e.g., M13mp18, 0.1
µg), or 0.1 µg double-stranded plasmid DNA (e.g., pUC19); 2 µl Reaction Buffer (260 mM Tris-HCl, pH 9.5, 65 mM
MgCl2); 1 µl 3.0 µM dGTP; 1 µl 3.0 µM dTTP; 0.5 µl (5 µCi) of [α-33P]dATP (about 2000 Ci/mmol); 1 µl -40 primer (0.5
µM; 0.5 pmol/µl 5'-GTTTTCCCAGTCACGAC-3'); 2 µl of a mixture containing 4 U/µl FY polymerase and 6.6 U/ml
Thermoplasma acidophilum inorganic pyrophosphatase (32 U/µl polymerase and 53 U/ml pyrophosphatase in 20 mM
Tris (pH 8.5), 100 mM KCl, 0.1 mM EDTA, 1 mM DTT, 0.5% NP-40, 0.5% TWEEN-20 and 50% glycerol, diluted 8-fold
in dilution buffer (10 mM Tris-HCl pH 8.0, 1 mM 2-mercaptoethanol, 0.5% TWEEN-20, 0.5% NP-40)); and water to a
total volume of 17.5 µl.
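
The enzyme dilution and the reaction volumes listed above can be checked with a short calculation. The Python sketch below is illustrative only: it takes the stock concentrations and the 8-fold dilution from the text, and it leaves out the template volume, which the text does not state.

```python
STOCK_POLYMERASE_U_PER_UL = 32.0   # FY polymerase stock
STOCK_PPASE_U_PER_ML = 53.0        # pyrophosphatase stock
DILUTION_FOLD = 8.0

polymerase_u_per_ul = STOCK_POLYMERASE_U_PER_UL / DILUTION_FOLD
ppase_u_per_ml = STOCK_PPASE_U_PER_ML / DILUTION_FOLD
polymerase_units = polymerase_u_per_ul * 2.0   # 2 ul of diluted mixture per reaction

# Fixed-volume components in ul; template volume is not stated, so it is omitted.
fixed_ul = {
    "reaction buffer": 2.0,
    "dGTP": 1.0,
    "dTTP": 1.0,
    "labeled dATP": 0.5,
    "primer": 1.0,
    "enzyme mixture": 2.0,
}
remaining_ul = 17.5 - sum(fixed_ul.values())   # left for template plus water

print(f"polymerase: {polymerase_u_per_ul} U/ul ({polymerase_units} U per reaction)")
print(f"pyrophosphatase: {ppase_u_per_ml:.1f} U/ml")
print(f"volume left for template and water: {remaining_ul} ul")
```

The diluted concentrations agree with the 4 U/µl and 6.6 U/ml quoted for the working mixture.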

These components (labeling reaction mixture) were mixed and overlaid with 10 µl light mineral oil (Amersham).
The vial was placed in the thermocycler and 30-100 cycles (more than 60 cycles is unnecessary) from 45°C for 1
minute to 95°C for 0.5 minute performed. (Temperatures can be cycled from 55°-95°C, if desired.) The temperatures
may be adjusted if the melting temperature of the primer/template is significantly higher or lower, but these temperatures
work well for most primer-template combinations. This step can be completed in about 3 minutes per cycle.

Four vials were labeled A, C, G, and T and filled with 4 µl of the corresponding termination mix: ddA termination
mix (150 µM each dATP, dCTP, dGTP, dTTP, 1.5 µM ddATP); ddT termination mix (150 µM each dATP, dCTP, dGTP,
dTTP, 1.5 µM ddTTP); ddC termination mix (150 µM each dATP, dCTP, dGTP, dTTP, 1.5 µM ddCTP); ddG termination
mix (150 µM each dATP, dCTP, dGTP, dTTP, 1.5 µM ddGTP). No additional enzyme is added to the termination vials.
The enzyme carried in from the prior (labeling) step is sufficient.

The cycled labeling reaction mixture was divided equally among the four termination vials (4 µl to each termination
reaction vial), and overlaid with 10 µl of light mineral oil.

The four vials were placed in the thermocycler and 30-200 cycles (more than 60 cycles is unnecessary) performed
from 95°C for 15 seconds, 55°C for 30 seconds, and 72°C for 120 seconds. This step was conveniently completed
overnight. Other times and temperatures are also effective.
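
To give a feel for the run times implied by the cycling programs above, the following Python sketch adds up the hold times per cycle. It assumes zero ramp time between temperatures, so the figures are lower bounds; the "about 3 minutes per cycle" quoted for the labeling step presumably includes ramping. The function name is hypothetical.

```python
def program_minutes(step_seconds, cycles):
    """Total hold time in minutes, ignoring temperature ramps."""
    return sum(step_seconds) * cycles / 60.0

LABELING_STEPS = (60, 30)          # 45 C for 1 min, 95 C for 0.5 min
TERMINATION_STEPS = (15, 30, 120)  # 95 C, 55 C, 72 C

for cycles in (30, 60, 200):
    minutes = program_minutes(TERMINATION_STEPS, cycles)
    print(f"termination, {cycles} cycles: {minutes:.1f} min ({minutes / 60:.1f} h)")

print(f"labeling, 30 cycles: {program_minutes(LABELING_STEPS, 30):.1f} min hold time")
```

At 200 cycles the termination program needs more than nine hours of hold time alone, which fits the remark that this step was run overnight.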

Six µl of reaction mixture was removed (avoiding oil), 3 µl of Stop Solution (95% Formamide, 20 mM EDTA, 0.05%
Bromophenol Blue, 0.05% Xylene Cyanol FF) was added, and the mixture was heated briefly to 70°-80°C immediately
prior to loading on a sequencing gel. Autoradiograms required an 18-36 hour exposure using Kodak XAR-5 film or
Amersham Hyperfilm MP. High-quality sequence results with uniform band intensities were obtained. The band
intensities were much more uniform than those obtained with similar protocols using Taq DNA polymerase or ΔTaq
DNA polymerase.

Example 3: Sequencing with dGTP analogs to eliminate compression artifacts.

For either of the sequencing methods outlined in Examples 1 and 2, 7-deaza-2'-deoxy-GTP can be substituted for
dGTP in the labeling and termination mixtures at exactly the same concentration as dGTP. When this substitution is
made, secondary structures on the gels are greatly reduced. Similarly, 2'-deoxyinosine triphosphate (dITP) can also
be substituted for dGTP, but its concentration must be 10-fold higher than the corresponding concentration of dGTP.
Substitution of dITP for dGTP is even more effective in eliminating compression artifacts than 7-deaza-dGTP.
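
The substitution rule above amounts to a simple scaling of the dGTP concentration. The Python sketch below applies it to the 3.0 µM dGTP labeling stock and the 150 µM termination-mix concentration given in Example 2; the dictionary and function names are illustrative, not part of any protocol.

```python
# Fold change relative to the dGTP concentration being replaced.
ANALOG_FOLD = {"7-deaza-dGTP": 1.0, "dITP": 10.0}

def substituted_conc(dgtp_um, analog):
    """Concentration (uM) of the analog that replaces dGTP at dgtp_um."""
    return dgtp_um * ANALOG_FOLD[analog]

for analog in ANALOG_FOLD:
    labeling = substituted_conc(3.0, analog)       # labeling stock
    termination = substituted_conc(150.0, analog)  # termination mix
    print(f"{analog}: {labeling} uM (labeling stock), {termination} uM (termination mix)")
```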


Example 4: Other Sequencing methods using FY polymerases 

FY polymerases have been adapted for use with many other sequencing methods, including the use of fluorescent
primers and fluorescent dideoxy-terminators for sequencing with the ABI 373A DNA sequencing instrument.

Example 5: SDS-Polyacrylamide Gel Electrophoresis

Protein samples were run on a 14 x 16 mm 7.5 or 10% polyacrylamide gel. (Gels were predominantly 10% polyacrylamide
using a 14 x 16 mm Hoefer apparatus. Other sizes, apparatuses, and percentage gels are acceptable.
Similar results can also be obtained using the Pharmacia Phast Gel system with SDS, 8-25% gradient gels. Reagent
grade and ultrapure grade reagents were used.) The stacking gel consisted of 4% acrylamide (30:0.8, acrylamide:
bisacrylamide), 125 mM Tris-HCl pH 6.8, 0.1% Sodium Dodecyl Sulfate (SDS). The resolving gel consisted of 7.5 or
10% acrylamide (30:0.8, acrylamide:bisacrylamide), 375 mM Tris-HCl pH 8.8, 0.1% SDS. Running Buffer consisted
of 25 mM Tris, 192 mM Glycine and 0.1% SDS. 1X Sample Buffer consisted of 25 mM Tris-HCl pH 6.8, 0.25% SDS,
10% Glycerol, 0.1 M Dithiothreitol, 0.1% Bromophenol Blue, and 1 mM EDTA. A 1/4 volume of 5X Sample Buffer was
added to each sample. Samples were heated in sample buffer to 90-100°C for approximately 5 minutes prior to loading.
A 1.5 mm thick gel was run at 50-100 mA constant current for 1-3 hours (until bromophenol blue was close to the
bottom of the gel). The gel was stained with 0.025% Coomassie Blue R250 in 50% methanol, 10% acetic acid and
destained in 5% methanol, 7% acetic acid solution. A record of the gel was made by taking a photograph of the gel,
by drying the gel between cellulose film sheets, or by drying the gel onto filter paper under a vacuum.
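
The sample buffer arithmetic above can be verified in a few lines. The Python sketch below is illustrative: it confirms that adding a 1/4 volume of 5X buffer to a sample gives 1X, and it derives the 5X composition by scaling the 1X recipe, which is an assumption, since the text states only the 1X values.

```python
ONE_X_SAMPLE_BUFFER = {
    "Tris-HCl pH 6.8 (mM)": 25.0,
    "SDS (%)": 0.25,
    "glycerol (%)": 10.0,
    "dithiothreitol (M)": 0.1,
    "bromophenol blue (%)": 0.1,
    "EDTA (mM)": 1.0,
}

def final_strength(stock_x, buffer_volume, sample_volume):
    """Buffer strength (in X) after adding buffer_volume of stock to sample_volume."""
    return stock_x * buffer_volume / (buffer_volume + sample_volume)

# A 1/4 volume of 5X buffer added to one volume of sample.
strength = final_strength(5.0, 0.25, 1.0)

# Assumed 5X composition: each 1X component scaled five-fold.
five_x = {name: 5 * value for name, value in ONE_X_SAMPLE_BUFFER.items()}

print(f"final buffer strength: {strength}X")
for name, value in five_x.items():
    print(f"5X {name}: {value}")
```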
Other embodiments are within the following claims. 
SEQUENCE LISTING



GENERAL INFORMATION:

(i) APPLICANT: AMERSHAM LIFE SCIENCE

(ii) TITLE OF INVENTION: THERMOSTABLE DNA POLYMERASES

(iii) NUMBER OF SEQUENCES: 14

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Lyon & Lyon
(B) STREET: 633 West Fifth Street, Suite 4700
(C) CITY: Los Angeles
(D) STATE: California
(E) COUNTRY: U.S.A.
(F) ZIP: 90071-2066

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: 3.5" Diskette, 1.44 Mb storage
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: IBM P.C. DOS 5.0
(D) SOFTWARE: Word Perfect 5.1

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: To Be Assigned
(B) FILING DATE:
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
Prior applications total, including application described below: one
(A) APPLICATION NUMBER: US 08/455,686
(B) FILING DATE: May 31, 1995

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Warburg, Richard J.
(B) REGISTRATION NUMBER: 32,327
(C) REFERENCE/DOCKET NUMBER: 219/304-PCT

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (213) 489-1600
(B) TELEFAX: (213) 955-0440
(C) TELEX: 67-3510


(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1686 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: FY2
(B) LOCATION: 1...1683
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATG CTG GAG AGG CTT GAG TTT GGC AGC CTC CTC CAC GAG TTC GGC CTT  48
Met Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu
1 5 10 15

CTG GAA AGC CCC AAG GCC CTG GAG GAG GCC CCC TGG CCC CCG CCG GAA  96
Leu Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu
20 25 30

GGG GCC TTC GTG GGC TTT GTG CTT TCC CGC AAG GAG CCC ATG TGG GCC  144
Gly Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala
35 40 45

EP 0 745 676 A1

GAT CTT CTG GCC CTG GCC GCC GCC AGG GGG GGC CGG GTC CAC CGG GCC  192
Asp Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala
50 55 60

CCC GAG CCT TAT AAA GCC CTG AGG GAC CTG AAG GAG GCG CGG GGG CTT  240
Pro Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu
65 70 75 80

CTC GCC AAA GAC CTG AGC GTT CTG GCC CTG AGG GAA GGC CTT GGC CTC  288
Leu Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu
85 90 95

CCG CCC GGC GAC GAC CCC ATG CTC CTC GCC TAC CTC CTG GAC CCT TCC  336
Pro Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser
100 105 110

AAC ACC ACC CCC GAG GGG GTG GCC CGG CGC TAC GGC GGG GAG TGG ACG  384
Asn Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr
115 120 125

GAG GAG GCG GGG GAG CGG GCC GCC CTT TCC GAG AGG CTC TTC GCC AAC  432
Glu Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn
130 135 140

CTG TGG GGG AGG CTT GAG GGG GAG GAG AGG CTC CTT TGG CTT TAC CGG  480
Leu Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg
145 150 155 160

GAG GTG GAG AGG CCC CTT TCC GCT GTC CTG GCC CAC ATG GAG GCC ACG  528
Glu Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr
165 170 175

GGG GTG CGC CTG GAC GTG GCC TAT CTC AGG GCC TTG TCC CTG GAG GTG  576
Gly Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val
180 185 190

GCC GAG GAG ATC GCC CGC CTC GAG GCC GAG GTC TTC CGC CTG GCC GGC  624
Ala Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly
195 200 205

CAC CCC TTC AAC CTC AAC TCC CGG GAC CAG CTG GAA AGG GTC CTC TTT  672
His Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe
210 215 220

GAC GAG CTA GGG CTT CCC GCC ATC GGC AAG ACG GAG AAG ACC GGC AAG  720
Asp Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys
225 230 235 240

CGC TCC ACC AGC GCC GCC GTC CTG GAG GCC CTC CGC GAG GCC CAC CCC  768
Arg Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro
245 250 255

ATC GTG GAG AAG ATC CTG CAG TAC CGG GAG CTC ACC AAG CTG AAG AGC  816
Ile Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser
260 265 270

ACC TAC ATT GAC CCC TTG CCG GAC CTC ATC CAC CCC AGG ACG GGC CGC  864
Thr Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg
275 280 285

CTC CAC ACC CGC TTC AAC CAG ACG GCC ACG GCC ACG GGC AGG CTA AGT  912
Leu His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser
290 295 300

AGC TCC GAT CCC AAC CTC CAG AAC ATC CCC GTC CGC ACC CCG CTT GGG  960
Ser Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly
305 310 315 320

CAG AGG ATC CGC CGG GCC TTC ATC GCC GAG GAG GGG TGG CTA TTG GTG  1008
Gln Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val
325 330 335



GCC CTG GAC TAT AGC CAG ATA GAG CTC AGG GTG CTG GCC CAC CTC TCC  1056
Ala Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser
340 345 350

GGC GAC GAG AAC CTG ATC CGG GTC TTC CAG GAG GGG CGG GAC ATC CAC  1104
Gly Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His
355 360 365

ACG GAG ACC GCC AGC TGG ATG TTC GGC GTC CCC CGG GAG GCC GTG GAC  1152
Thr Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp
370 375 380

CCC CTG ATG CGC CGG GCG GCC AAG ACC ATC AAC TAC GGG GTC CTC TAC  1200
Pro Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Tyr Gly Val Leu Tyr
385 390 395 400

GGC ATG TCG GCC CAC CGC CTC TCC CAG GAG CTA GCC ATC CCT TAC GAG  1248
Gly Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu
405 410 415

GAG GCC CAG GCC TTC ATT GAG CGC TAC TTT CAG AGC TTC CCC AAG GTG  1296
Glu Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val
420 425 430

CGG GCC TGG ATT GAG AAG ACC CTG GAG GAG GGC AGG AGG CGG GGG TAC  1344
Arg Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr
435 440 445

GTG GAG ACC CTC TTC GGC CGC CGC CGC TAC GTG CCA GAC CTA GAG GCC  1392
Val Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala
450 455 460

CGG GTG AAG AGC GTG CGG GAG GCG GCC GAG CGC ATG GCC TTC AAC ATG  1440
Arg Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met
465 470 475 480

CCC GTC CAG GGC ACC GCC GCC GAC CTC ATG AAG CTG GCT ATG GTG AAG  1488
Pro Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys
485 490 495

CTC TTC CCC AGG CTG GAG GAA ATG GGG GCC AGG ATG CTC CTT CAG GTC  1536
Leu Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val
500 505 510



CAC GAC GAG CTG GTC CTC GAG GCC CCA AAA GAG AGG GCG GAG GCC GTG  1584
His Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val
515 520 525

GCC CGG CTG GCC AAG GAG GTC ATG GAG GGG GTG TAT CCC CTG GCC GTG  1632
Ala Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val
530 535 540

CCC CTG GAG GTG GAG GTG GGG ATA GGG GAG GAC TGG CTC TCC GCC AAG  1680
Pro Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys
545 550 555 560

GAG TGA  1686
Glu *
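
The pairing of codons and amino acids in the listing above follows the standard genetic code, which can be reproduced with a short script. The Python sketch below is a minimal illustration: it builds the standard codon table from the conventional TCAG ordering and translates the first and last rows of SEQ ID NO: 1; the helper name `translate` is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate a DNA string (spaces allowed) into one-letter amino acids."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3   # drop any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

first_row = "ATG CTG GAG AGG CTT GAG TTT GGC AGC CTC CTC CAC GAG TTC GGC CTT"
last_row = "GAG TGA"

print(translate(first_row))
print(translate(last_row))
```

The first row translates to Met Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu in one-letter form, and the final TGA codon is reported as a stop (`*`).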



(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1689 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: FY3
(B) LOCATION: 1...1686
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
ATG GCT CTG GAA CGT CTG GAG TTT GGC AGC CTC CTC CAC GAG TTC GGC  48
Met Ala Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly
1 5 10 15

CTT CTG GAA AGC CCC AAG GCC CTG GAG GAG GCC CCC TGG CCC CCG CCG  96
Leu Leu Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro
20 25 30

GAA GGG GCC TTC GTG GGC TTT GTG CTT TCC CGC AAG GAG CCC ATG TGG  144
Glu Gly Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp
35 40 45

GCC GAT CTT CTG GCC CTG GCC GCC GCC AGG GGG GGC CGG GTC CAC CGG  192
Ala Asp Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg
50 55 60

GCC CCC GAG CCT TAT AAA GCC CTC AGG GAC CTG AAG GAG GCG CGG GGG  240
Ala Pro Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly
65 70 75 80

CTT CTC GCC AAA GAC CTG AGC GTT CTG GCC CTG AGG GAA GGC CTT GGC  288
Leu Leu Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly
85 90 95

CTC CCG CCC GGC GAC GAC CCC ATG CTC CTC GCC TAC CTC CTG GAC CCT  336
Leu Pro Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro
100 105 110

TCC AAC ACC ACC CCC GAG GGG GTG GCC CGG CGC TAC GGC GGG GAG TGG  384
Ser Asn Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp
115 120 125

ACG GAG GAG GCG GGG GAG CGG GCC GCC CTT TCC GAG AGG CTC TTC GCC  432
Thr Glu Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala
130 135 140

AAC CTG TGG GGG AGG CTT GAG GGG GAG GAG AGG CTC CTT TGG CTT TAC  480
Asn Leu Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr
145 150 155 160

CGG GAG GTG GAG AGG CCC CTT TCC GCT GTC CTG GCC CAC ATG GAG GCC  528
Arg Glu Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala
165 170 175

ACG GGG GTG CGC CTG GAC GTG GCC TAT CTC AGG GCC TTG TCC CTG GAG  576
Thr Gly Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu
180 185 190

GTG GCC GAG GAG ATC GCC CGC CTC GAG GCC GAG GTC TTC CGC CTG GCC  624
Val Ala Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala
195 200 205

GGC CAC CCC TTC AAC CTC AAC TCC CGG GAC CAG CTG GAA AGG GTC CTC  672
Gly His Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu
210 215 220

TTT GAC GAG CTA GGG CTT CCC GCC ATC GGC AAG ACG GAG AAG ACC GGC  720
Phe Asp Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly
225 230 235 240

AAG CGC TCC ACC AGC GCC GCC GTC CTG GAG GCC CTC CGC GAG GCC CAC  768
Lys Arg Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His
245 250 255

CCC ATC GTG GAG AAG ATC CTG CAG TAC CGG GAG CTC ACC AAG CTG AAG  816
Pro Ile Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys
260 265 270

AGC ACC TAC ATT GAC CCC TTG CCG GAC CTC ATC CAC CCC AGG ACG GGC  864
Ser Thr Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly
275 280 285

CGC CTC CAC ACC CGC TTC AAC CAG ACG GCC ACG GCC ACG GGC AGG CTA  912
Arg Leu His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu
290 295 300

AGT AGC TCC GAT CCC AAC CTC CAG AAC ATC CCC GTC CGC ACC CCG CTT  960
Ser Ser Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu
305 310 315 320

GGG CAG AGG ATC CGC CGG GCC TTC ATC GCC GAG GAG GGG TGG CTA TTG  1008
Gly Gln Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu
325 330 335

GTG GCC CTG GAC TAT AGC CAG ATA GAG CTC AGG GTG CTG GCC CAC CTC  1056
Val Ala Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu
340 345 350

TCC GGC GAC GAG AAC CTG ATC CGG GTC TTC CAG GAG GGG CGG GAC ATC  1104
Ser Gly Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile
355 360 365

CAC ACG GAG ACC GCC AGC TGG ATG TTC GGC GTC CCC CGG GAG GCC GTG  1152
His Thr Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val
370 375 380

GAC CCC CTG ATG CGC CGG GCG GCC AAG ACC ATC AAC TAC GGG GTC CTC  1200
Asp Pro Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Tyr Gly Val Leu
385 390 395 400

TAC GGC ATG TCG GCC CAC CGC CTC TCC CAG GAG CTA GCC ATC CCT TAC  1248
Tyr Gly Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr
405 410 415

GAG GAG GCC CAG GCC TTC ATT GAG CGC TAC TTT CAG AGC TTC CCC AAG  1296
Glu Glu Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys
420 425 430

GTG CGG GCC TGG ATT GAG AAG ACC CTG GAG GAG GGC AGG AGG CGG GGG  1344
Val Arg Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly
435 440 445

TAC GTG GAG ACC CTC TTC GGC CGC CGC CGC TAC GTG CCA GAC CTA GAG  1392
Tyr Val Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu
450 455 460

GCC CGG GTG AAG AGC GTG CGG GAG GCG GCC GAG CGC ATG GCC TTC AAC  1440
Ala Arg Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn
465 470 475 480

ATG CCC GTC CAG GGC ACC GCC GCC GAC CTC ATG AAG CTG GCT ATG GTG  1488
Met Pro Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val
485 490 495

AAG CTC TTC CCC AGG CTG GAG GAA ATG GGG GCC AGG ATG CTC CTT CAG  1536
Lys Leu Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln
500 505 510

GTC CAC GAC GAG CTG GTC CTC GAG GCC CCA AAA GAG AGG GCG GAG GCC  1584
Val His Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala
515 520 525

GTG GCC CGG CTG GCC AAG GAG GTC ATG GAG GGG GTG TAT CCC CTG GCC  1632
Val Ala Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala
530 535 540

GTG CCC CTG GAG GTG GAG GTG GGG ATA GGG GAG GAC TGG CTC TCC GCC  1680
Val Pro Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala
545 550 555 560

AAG GAG TGA  1689
Lys Glu *



(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
GCTTGGGCAG AGGATCCGCC GGG  23

(2) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 64 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
GGGATGGCTA GCTCCTGGGA GAGGCGGTGG GCCGACATGC CGTAGAGGAC  50
CCCGTAGTTG ATGG  64
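
The relationship between this oligonucleotide and the FY2 coding sequence can be checked by reverse-complementing it. The Python sketch below is illustrative: it compares the reverse complement of SEQ ID NO: 4 with nucleotides 1153-1248 of SEQ ID NO: 1 (codons 385-416, as printed in the listing) and reports the codon at residue 396. Reading that TAC codon as the tyrosine at the position equivalent to 667 of Taq DNA polymerase is an inference from the claims, not something the listing states.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA string; spaces are ignored."""
    return seq.replace(" ", "").upper().translate(COMPLEMENT)[::-1]

# SEQ ID NO: 4 as listed (64 bases).
SEQ_ID_4 = ("GGGATGGCTA GCTCCTGGGA GAGGCGGTGG GCCGACATGC CGTAGAGGAC "
            "CCCGTAGTTG ATGG")

# SEQ ID NO: 1, nucleotides 1153-1248 (codons 385-416).
SEQ_ID_1_FRAGMENT = (
    "CCC CTG ATG CGC CGG GCG GCC AAG ACC ATC AAC TAC GGG GTC CTC TAC "
    "GGC ATG TCG GCC CAC CGC CTC TCC CAG GAG CTA GCC ATC CCT TAC GAG"
)

target = reverse_complement(SEQ_ID_4)
coding = SEQ_ID_1_FRAGMENT.replace(" ", "")
offset = coding.find(target)               # -1 would mean no exact match

codon_index = 396 - 385                     # residue 396 within the fragment
codon_396 = coding[codon_index * 3:codon_index * 3 + 3]

print("reverse complement found at fragment offset:", offset)
print("codon 396:", codon_396)
```

An exact match indicates that SEQ ID NO: 4 is written as the antisense strand of this region and spans the TAC codon.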



(2) INFORMATION FOR SEQ ID NO: 5: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
GGAATTCCAT ATGGACGATC TGAAGCTCTC C 31 

(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
GGGGTACCAA GCTTCACTCC TTGGCGGAGA G

(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GGAATTCCAT ATGCTGGAGA GGCTTGAGTT T



(2) INFORMATION FOR SEQ ID NO: 8: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
GGAATTCCAT ATGCTGGAAC GTCTGGAGTT TGGCAGCCTC CTC 



(2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 46 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
GGAATTCCAT ATGGCTCTGG AACGTCTGGA GTTTGGCAGC CTCCTC

(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
GGAATTCCAT ATGCTGGAAC GTCTGGAATT CGGCAGCCTC

(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
GGGGTACCCT AACCCTTGGC GGAAAGCCAG TC

(2) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 64 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
GGGATGGCTA GCTCCTGGGA GAGCCTATGG GCGGACATGC CGTAGAGGAC
GCCGTAGTTC ACCG

(2) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
CTAGCTAGCC ATCCCCTACG AAGAAGCGGT GGCCT



(2) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1686 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: FY4
(B) LOCATION: 1...1683
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

ATG CTG GAA CGT CTG GAA TTC GGC AGC CTC CTC CAC GAG TTC GGC CTC  48
Met Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu
1 5 10 15

CTG GAG GCC CCC GCC CCC CTG GAG GAG GCC CCC TGG CCC CCG CCG GAA  96
Leu Glu Ala Pro Ala Pro Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu
20 25 30

GGG GCC TTC GTG GGC TTC GTC CTC TCC CGC CCC GAG CCC ATG TGG GCG  144
Gly Ala Phe Val Gly Phe Val Leu Ser Arg Pro Glu Pro Met Trp Ala
35 40 45

GAG CTT AAA GCC CTG GCC GCC TGC AGG GAC GGC CGG GTG CAC CGG GCA  192
Glu Leu Lys Ala Leu Ala Ala Cys Arg Asp Gly Arg Val His Arg Ala
50 55 60

GCA GAC CCC TTG GCG GGG CTA AAG GAC CTC AAG GAG GTC CGG GGC CTC  240
Ala Asp Pro Leu Ala Gly Leu Lys Asp Leu Lys Glu Val Arg Gly Leu
65 70 75 80

CTC GCC AAG GAC CTC GCC GTC TTG GCC TCG AGG GAG GGG CTA GAC CTC  288
Leu Ala Lys Asp Leu Ala Val Leu Ala Ser Arg Glu Gly Leu Asp Leu
85 90 95



GTG CCC GGG GAC GAC CCC ATG CTC CTC GCC TAC CTC CTG GAC CCC TCC  336
Val Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser
100 105 110

AAC ACC ACC CCC GAG GGG GTG GCG CGG CGC TAC GGG GGG GAG TGG ACG  384
Asn Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr
115 120 125

GAG GAC GCC GCC CAC CGG GCC CTC CTC TCG GAG AGG CTC CAT CGG AAC  432
Glu Asp Ala Ala His Arg Ala Leu Leu Ser Glu Arg Leu His Arg Asn
130 135 140

CTC CTT AAG CGC CTC GAG GGG GAG GAG AAG CTC CTT TGG CTC TAC CAC  480
Leu Leu Lys Arg Leu Glu Gly Glu Glu Lys Leu Leu Trp Leu Tyr His
145 150 155 160

GAG GTG GAA AAG CCC CTC TCC CGG GTC CTG GCC CAC ATG GAG GCC ACC  528
Glu Val Glu Lys Pro Leu Ser Arg Val Leu Ala His Met Glu Ala Thr
165 170 175

GGG GTA CGG CTG GAC GTG GCC TAC CTT CAG GCC CTT TCC CTG GAG CTT  576
Gly Val Arg Leu Asp Val Ala Tyr Leu Gln Ala Leu Ser Leu Glu Leu
180 185 190

GCG GAG GAG ATC CGC CGC CTC GAG GAG GAG GTC TTC CGC TTG GCG GGC  624
Ala Glu Glu Ile Arg Arg Leu Glu Glu Glu Val Phe Arg Leu Ala Gly
195 200 205

CAC CCC TTC AAC CTC AAC TCC CGG GAC CAG CTG GAA AGG GTG CTC TTT  672
His Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe
210 215 220

GAC GAG CTT AGG CTT CCC GCC TTG GGG AAG ACG CAA AAG ACA GGC AAG  720
Asp Glu Leu Arg Leu Pro Ala Leu Gly Lys Thr Gln Lys Thr Gly Lys
225 230 235 240

CGC TCC ACC AGC GCC GCG GTG CTG GAG GCC CTA CGG GAG GCC CAC CCC  768
Arg Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro
245 250 255

ATC GTG GAG AAG ATC CTC CAG CAC CGG GAG CTC ACC AAG CTC AAG AAC  816
Ile Val Glu Lys Ile Leu Gln His Arg Glu Leu Thr Lys Leu Lys Asn
260 265 270

ACG TAC GTG GAC CCC CTC CCA AGC CTC GTC CAC CCG AGG ACG GGC CGC  864
Thr Tyr Val Asp Pro Leu Pro Ser Leu Val His Pro Arg Thr Gly Arg
275 280 285

CTC CAC ACC CGC TTC AAC CAG ACG GCC ACG GCC ACG GGG AGG CTT AGT  912
Leu His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser
290 295 300

AGC TCC GAC CCC AAC CTG CAG AAC ATC CCC GTC CGC ACC CCC TTG GGC  960
Ser Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly
305 310 315 320

CAG AGG ATC CGC CGG GCC TTC GTG GCC GAG GCG GGT TGG GCG TTG GTG  1008
Gln Arg Ile Arg Arg Ala Phe Val Ala Glu Ala Gly Trp Ala Leu Val
325 330 335

GCC CTG GAC TAT AGC CAG ATA GAG CTC CGC GTC CTC GCC CAC CTC TCC  1056
Ala Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser
340 345 350

GGG GAC GAA AAC CTG ATC AGG GTC TTC CAG GAG GGG AAG GAC ATC CAC  1104
Gly Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Lys Asp Ile His
355 360 365

ACC CAG ACC GCA AGC TGG ATG TTC GGC GTC CCC CCG GAG GCC GTG GAC  1152
Thr Gln Thr Ala Ser Trp Met Phe Gly Val Pro Pro Glu Ala Val Asp
370 375 380

CCC CTG ATG CGC CGG GCG GCC AAG ACG GTG AAC TAC GGC GTC CTC TAC  1200
Pro Leu Met Arg Arg Ala Ala Lys Thr Val Asn Tyr Gly Val Leu Tyr
385 390 395 400

GGC ATG TCC GCC CAT AGG CTC TCC CAG GAG CTA GCC ATC CCC TAC GAA  1248
Gly Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu
405 410 415

GAA GCG GTG GCC TTT ATA GAG CGC TAC TTC CAA AGC TTC CCC AAG GTG  1296
Glu Ala Val Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val
420 425 430

CGG GCC TGG ATA GAA AAG ACC CTG GAG GAG GGG AGG AAG CGG GGC TAC  1344
Arg Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Lys Arg Gly Tyr
435 440 445

GTG GAA ACC CTC TTC GGA AGA AGG CGC TAC GTG CCC GAC CTC AAC GCC  1392
Val Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn Ala
450 455 460

CGG GTG AAG AGC GTC AGG GAG GCC GCG GAG CGC ATG GCC TTC AAC ATG  1440
Arg Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met
465 470 475 480

CCC GTC CAG GGC ACC GCC GCC GAC CTC ATG AAG CTC GCC ATG GTG AAG  1488
Pro Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys
485 490 495

CTC TTC CCC CGC CTC CGG GAG ATG GGG GCC CGC ATG CTC CTC CAG GTC  1536
Leu Phe Pro Arg Leu Arg Glu Met Gly Ala Arg Met Leu Leu Gln Val
500 505 510

CAC GAC GAG CTC CTC CTG GAG GCC CCC CAA GCG CGG GCC GAG GAG GTG  1584
His Asp Glu Leu Leu Leu Glu Ala Pro Gln Ala Arg Ala Glu Glu Val
515 520 525



GCG GCT TTG GCC AAG GAG GCC ATG GAG AAG GCC TAT CCC CTC GCC GTG  1632
Ala Ala Leu Ala Lys Glu Ala Met Glu Lys Ala Tyr Pro Leu Ala Val
530 535 540

CCC CTG GAG GTG GAG GTG GGG ATG GGG GAG GAC TGG CTT TCC GCC AAG  1680
Pro Leu Glu Val Glu Val Gly Met Gly Glu Asp Trp Leu Ser Ala Lys
545 550 555 560

GGT TAG  1686
Gly *
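
The LENGTH and LOCATION fields given for SEQ ID NOS: 1, 2 and 14 imply the size of each encoded polypeptide. The Python sketch below is illustrative: it converts each coding span to a residue count and compares it with the range of 540 to 582 amino acids recited in the claims. It assumes that the three bases after each coding span are the stop codon, which matches the final TGA or TAG printed in the listing.

```python
# name: (total length in bp, coding start, coding end), taken from the listing
SEQUENCES = {
    "FY2 (SEQ ID NO: 1)": (1686, 1, 1683),
    "FY3 (SEQ ID NO: 2)": (1689, 1, 1686),
    "FY4 (SEQ ID NO: 14)": (1686, 1, 1683),
}

def residues(start, end):
    """Number of amino acids encoded by nucleotides start..end inclusive."""
    span = end - start + 1
    if span % 3:
        raise ValueError("coding span is not a whole number of codons")
    return span // 3

lengths = {name: residues(s, e) for name, (_, s, e) in SEQUENCES.items()}
in_range = {name: 540 <= n <= 582 for name, n in lengths.items()}

for name, (total, _, end) in SEQUENCES.items():
    trailing = total - end   # bases left after the coding span (stop codon)
    print(f"{name}: {lengths[name]} residues, {trailing} trailing bases, "
          f"within 540-582: {in_range[name]}")
```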
Claims

1. An enzymatically active DNA polymerase having between 540 and 582 amino acids having a tyrosine at a position
equivalent to position 667 of Taq DNA polymerase, wherein said polymerase lacks 5' to 3' exonuclease activity,
and wherein said polymerase has at least 95% homology in its amino acid sequence to the DNA polymerase of
Thermus aquaticus, Thermus flavus or Thermus thermophilus, and wherein said polymerase forms a single
polypeptide band on an SDS polyacrylamide gel.

2. The polymerase of claim 1 wherein the amino acid sequence of said polymerase includes less than 3 conservative 
amino acid changes compared to one said DNA polymerase of said named Thermus species. 

3. The polymerase of claim 1 wherein the amino acid sequence of said polymerase includes less than 3 additional
amino acids compared to one said DNA polymerase of said named Thermus species at its N-terminus.

4. The polymerase of claim 1 selected from the group consisting of FY2, FY3 and FY4.

5. Purified nucleic acid encoding the DNA polymerase of any of claims 1-4. 



6. Method for sequencing DNA comprising the step of generating chain terminated fragments from the DNA template
to be sequenced with a DNA polymerase of any of claims 1-4 in the presence of at least one chain terminating
agent and one or more nucleotide triphosphates, and determining the sequence of said DNA from the sizes of said
fragments.

7. Kit for sequencing DNA comprising a DNA polymerase of any of claims 1-4 and a pyrophosphatase.

8. The kit of claim 7 wherein said pyrophosphatase is thermostable. 
9. Apparatus for DNA sequencing having a reactor comprising a DNA polymerase of any of claims 1-4 and a band
separator.
FIG. 1
(sheec 1) 



^DMA sequence 1666 b.p. acgccggagagg ... gccaaggagcga linear 

'f 

1/1 31/11 

acg ccg gag agg cct gag .cct ggc age etc etc cac gag ccc ggc etc' ccg gaa age ccc 
MLERLE.FGSLLHEFGLLESP 
61/21 91/31 

aag gcc ctg gag gag gcc ccc tgg ccc ccg ccg gaa otO gcc etc gtg ggc ccc gtg etc 
KALEEAPW PPPEGAFVGFVL 
121/41 ISl/Sl 
. CCC cgc aag gag ccc acg Cog gcc gac etc ctg gcc ccg gcc gcc gcc agg ggg ggc egg 
SRKE PMWADLLALAAARGGR 
181/61 '211/71 

gcc cac egg gcc ccc gag ccc cac aaa gcc ccc agg gac ccg aag gag gcg egg ggg etc 
VHRAPEPYKALRDLKEARGL 
241/81 271/91 

ccc gcc aaa gac ctg age gcc ccg gcc ccg agg gaa ggc etc ggc, ccc ccg ccc ggc gac 
LAK DLSVLALR EGLGLP PGD 
JOl/101 331/111 

gac ccc atg etc etc gcc cac ccc ccg gac ect ccc aac^acc acc ccc gag ggg gcg gcc 
D PMLLA Y L.LD P S NTT PEGVA 
361/121 391/131 

egg cgc cac ggc ggg gag egg acg gag gag gcg ggg gag egg gcc gcc ccc ccc gag agg 
RRyCCEWTE EAG ERAALSER 
421/141 451/151 

ccc ccc gcc aac ccg egg ggg agg ccc gag ggg gag gag agg etc ccc egg ccc cac egg 
LFANLVJGR LEG E ER L LW LY R 
431/161 511/171 

gag gcg gag agg ccc ccc ccc gcc gcc ccg gcc cac acg gag gcc acg ggg gcg cgc, ccg 
E V ER P LSAV L A H M E ATG'vR L 
541/181 571/191 

gac gcg gcc cac ccc agg gcc ccg . ccc ctg gag gcg gcc gag gag acc gee cgc ccc gag 

dv aylralslevaee 'iarle 

601/201 631/211 

gcc gag gcc tec cgc ccg gcc ggc cac ccc ccc aac ccc aac Ccc egg gac cag ccg. gaa 
A E V.FR L A GH P F.N I N S RDQL £ 
661/221 ' 691/231 

agg gcc ccc ccc gac gag cca ggg ccc ccc gcc acc ggc' aag acg gag aag acc ggc aag 
R V L FD.E L G L P A I G K*T EK TG K 
721/241 ' 7S1/251 

cgc ccc acc age gcc gcc gcc ccg gag gcc etc cgc gag gcc cac ccc acc gcg gag aag 
RSTSAAVLEALR EAH P. I VEK 
781/261 811/271 

acc ctg cag tac egg gag etc acc aag ccg aag age acc cac acc gac ccc ccg ccg gac 
I LQYR EL TK LK S TY I OP LPD 
641/261 871/291 

etc ace cac ccc agg acg ggc cgc ccc cac acc cgc ccc aac cag acg gcc acg gee acg 
LI HP RTGRLHTR FNQTATAT 
901/301 931/311 

ggc agg cca age age ccc gac ccc aac ccc cag aac acc ccc gcc cgc acc ccg ccc ggg 
GRLSSSDP NLON I PV RTP LG 
9&1/321 991/331 

cag agg acc cgc egg gcc ccc acc gcc gag gag ggg egg cca ccg gcg gcc ccg gac cac 
OR IR R A F lA E E C W L L VA LD Y 
1021/341 10SI/3S1 

age cag aca gag ccc agg gcg ccg gcc cac ccc ccc ggc gac gag aac ccg acc egg gcc 
SQIELRVLAHLSGOENLIRV 
1081/361 1111/371 

ccc cag gag ggg cgc gac acc cac acg gag acc gcc age egg atg ccc ggc gcc ccc egg 
F-QEGROI HTETASWM rCVPR 
' 1141/381 1171/291 

gag gcc gcg gac ccc ccg acg cgc egg gcg gcc aag acc acc aac cac ggg gcc ccc cac 
EAVOPLM.RRAA KTIN YCV LY 
1201/401 1231/411 

ggc atg ecg gcc cac cgc etc ccc cag gag cca gcc acc cce c^c gag gag gcc cag gee 
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1261/421 1291/431

CCC att gag cgc tac cut cag age tcc ccc aag gcg egg gcc tgg att gag aag acc ctg
F I E R Y F Q S F P K V R A W I E K T L
1321/441 1351/451

gag gag ggc agg agg egg ggg cac gtg gag acc etc tte ggc cgc cgc cgc tac gcg cca
E E G R R R G Y V E T L F G R R R Y V P
1381/461 1411/471

gac eta gag gcc egg gCg aag age gtg egg gag gcg gee gag cgc atg gee tte aac atg
D L E A R V K S V R E A A E R M A F N M
1441/481 1471/491

ecc gcc cag ggc acc gcc gcc gac ccc atg aag ctg get acg gtg aag etc tte ecc agg
P V Q G T A A D L M K L A M V K L F P R
1501/501 1531/511

ctg gag gaa atg ggg gcc agg acg etc etc cag gtc cac gac gag ctg gtc etc gag gcc
L E E M G A R M L L Q V H D E L V L E A
1561/521 1591/531

eca aaa gag agg gcg gag gcc gcg gcc egg ctg gcc aag gag gte atg gag ggg gtg tat
P K E R A E A V A R L A K E V M E G V Y
1621/541 1651/551

ccc ctg gcc gtg ccc ctg gag gtg gag gtg ggg aca ggg gag gac tgg etc tcc gcc aag
P L A V P L E V E V G I G E D W L S A K
1681/561

gag tga
E



FIG. 1 

(sheet 2)
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FIG. 2
(sheet 1)

DNA sequence 1689 b.p. atggcectggaa ... gccaaggagtga linear

1/1 31/11

atg gcC ccg gaa cgc ctg gag ccc ggc age ccc ccc cac gag tec ggc cct ctg gaa age 
M A L E R L E F G S L L H E F G L L E S
61/21 91/31

ccc aag gcc ctg gag gag gee ccc egg ccc ccg ccg gaa ggg gcc ccc gcg ggc (Ccc gcg 
P K A L E E A P W P P P E G A F V G F V
121/41 151/51

ccc ccc cgc aag gag ccc acg egg gcc gac ccc ceg gcc ccg gcc gcc gcc agg ggg ggc 
LS RKE PMWA DLLALAAARGG 

181/61 211/71 

egg gtc cac egg gcc ccc gag ccc caC aaa gcc ccc agg gac ccg aag gag gcg egg ggg . 
RVH RAPEPYKALRDLKEARG 
241/81 271/91

CCC ccc gcc aaa gac ccg age gcc cCg gcc ccg agg gaa ggc ccc ggc ccc ccg ccc ggc 
L L A K D L S V L A L R E G L G L P P G
301/101 331/111 
gac gac ccc acg ccc ccc gcc cac ccc ceg gac ccc ccc aac acc acc ccc gag ggg gcg
D D P M L L A Y L L D P S N T T P E G V
361/121 391/131

gcc egg cgc tac ggc ggg gag egg acg gag gag gcg ggg gag egg gcc gcc cee ccc gag 
ARRYGGEWTE EAGERAALSE 
421/141 451/151 

agg ccc ccc gcc aac ccg egg ggg agg cec gag ggg gag gag agg ccc ccc egg ccc cac 
RL FA NLWGR L EG EERLLW LY 
481/161 511/171

egg gag gcg gag agg ccc cCC ccc get gCc ccg gcc cac atg gag gcc acg ggg gcg cgc 
R E V E R P L S A V L A H M E A T G V R
541/181 571/191

ccg gac gcg gcc eac cCc agg gcc ccg ccc ccg gag gcg gcc gag gag acc gcc cgc ccc 
LDVA YLRALS LEVAEEIARL 
601/201 631/211 

gag gcc gag gCc ccc cgc ccg gcc ggc cac ccc etc aac ccc aac tec egg gac cag ccg 
EAEVF RLAGH PFNLNSR DQL 
661/221 691/231 

gaa agg gee ccc ccc gac gag cca ggg ccc ccc gcc acc ggc aag acg gag aag acc ggc 
E R V L F D E L G L P A I G K T E K T G
721/241 751/251

aag cgc Ccc acc age gcc gcc gcc ccg gag gcc etc cgc gag gcc cac ccc acc gcg gag 
K R S T S A A V L E A L R E A H P I V E
781/261 811/271 

aag acc ccg cag cac egg gag ccc acc aag ccg aag age acc cac act gac ccc ccg ccg 
K I L Q Y R E L T K L K S T Y I D P L P
841/281 871/291

gac ccc ace cac ccc agg acg ggc cgc ccc cac acc cgc ccc aac cag acg gcc acg 'gcc 
D L I H P R T G R L H T R F N Q T A T A
901/301 931/311

acg ggc agg cea agC age ecc gac ccc aac cec cag aac acc ccc gcc cgc acc ccg etc 
T G R L S S S D P N L Q N I P V R T P L
961/321 991/331 

ggg cag agg acc cgc egg gcc ccc acc gcc gag gag ggg egg cca ccg gcg gcc ccg gac 
G Q R I R R A F I A E E G W L L V A L D
1021/341 1051/351

cac age cag aca gag ccc agg' gcg ccg gcc cac ccc ccc ggc gac gag aac ccg acc .egg 

Y S Q I E L R V L A H L S G D E N L I R
1081/361 1111/371 

gcc ccc cag gag ggg egg gac ate cac acg gag acc gcc age egg acg etc ggc gcc ccc 

V F Q E G R D I H T E T A S W M F G V P
1141/381 1171/391 

egg gag gcc gCg gac ccc ccg acg cgc egg qcg gcc aag acc ate aac cac ggg gcc ccc 
R E A V D P L M R R A A K T I N Y G V L
1201/401 1231/411

cac ggc acg ccg gcc cac cgc ccc ccc cag gag cca gcc acc ccc cac gag gay gcc cag 
Y G M S A H R L S Q E L A I P Y E E A Q
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1261/421 1291/431

ucc ttc act gag cgc Cac CCC cag age CCc ecc aag gcg egg gcc Cgg acc gag aag acc
A F I E R Y F Q S F P K V R A W I E K T
1321/441 1351/451

ftCg gag gag ggc agg agg egg ggg cac gcg gag acc etc ccc ggc cgc cgc cgc cac gcg
L E E G R R R G Y V E T L F G R R R Y V
1381/461 1411/471

cca gac cca gag gee egg gcg aag age gcg egg gag gcg gcc gag cgc acg gcc ccc aac
P D L E A R V K S V R E A A E R M A F N
1441/481 1471/491

atg ccc gcc cag ggc acc gcc gcc gac ccc acg aag cCg gcc aCg gcg aag cCc tec ccc
M P V Q G T A A D L M K L A M V K L F P
1501/501 1531/511

agg cCg gag gaa acg ggg gcc agg acg etc etc cag gtc cac gac gag ccg gcc ccc gag
R L E E M G A R M L L Q V H D E L V L E
1561/521 1591/531

gcc cca aaa gag agg gcg gag gcc gcg gcc egg ccg gcc aag gag gcc acg gag ggg gcg
A P K E R A E A V A R L A K E V M E G V
1621/541 1651/551

tat ccc ccg gcc gcg ccc cCg gag gcg gag Qtg ggg aCa ggg gag gac Cgg ccc tec gcc
Y P L A V P L E V E V G I G E D W L S A
1681/561

aag gag tga
K E *



FIG. 2 
(sheet 2) 
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DNA sequence 2496 b.p. atggcgatgctc ... gccaaggagtag linear

1/1 31/11

atg gcg acg etc ccc ccc tec gag ccc aaa ggc cgc gtg etc czg gcg gac ggc cac cac 

M A M L P L F E P K G R V L L V D G H H
61/21 91/31

ccg gcc tac cgc ace etc etc gcc ccc aag ggc ccc acc acc age cgc ggc gaa ccc gcc 
L A Y R T F F A L K G L T T S R G E P V

121/41 151/51

cag gcg gtc cac ggc etc gcc aaa age etc etc aag gcc ccg aag gag gac ggg gac gtg 
Q A V Y G F A K S L L K A L K E D G D V

181/61 211/71 

gcg gcg gCg gcc ccc gac gcc aag gee ccc ccc ccc cgc cac gag gcc tac gag gcc cac 
VVVV F DA K A P S FRH EAY EA Y 
241/81 271/91 

aag gcg ggc egg gcc ccc acc ccg gag gac tec ccc egg cag cCg gcc ccc acc aag gag 
K AGRAPTPEDFPRQLALIKE 
301/101 331/111 

ccg gcg gac etc cca ggc etc gtg egg ctg gag get ccc ggc ctt gag gcg gac gac gcg 
L VDLLGLV RLEV PGFEADDV 
361/121 391/131 

ccg gcc acc ctg gcc aag egg gcg gaa aag gag ggg tac gag gtg cgc acc ccc acc gcc 
LATLAKR AEKEGYEVRI LTA 
421/141 451/151 

gac cgc gac etc cac cag ccc ccc ccg gag cgc acc gcc ace ccc cac -ccC gag ggg eac 
D R D L Y Q L L S E R I A I L H P E G Y
481/161 511/171

ceg ace acc ccg gcg egg ccc cac gag aag cac ggc ccg cgc ccg gag cag egg gcg goc 
L I T P A W L Y E K Y G L R P E Q W V D
541/181 571/191

tac ogg gcc ctg gcg ggg gac ccc ccg gat aac acc ccc ggg gcg aag ggc acc ggg gag . 
Y R A L A G D P S D N I P G V K G I G E
601/201 631/211
aag" acc gee cag agg cCc aCc cgc gag Cgg-ggg age ctg gaa aac ccc etc cag cac ctg 
K T A Q R L I R E W G S L E N L F Q H L
661/221 691/231 

gac cag gcg aag ccc ccc ccg egg gag aag ccc cag gcg ggc atg gag gee ccg gcc ctt 
D Q V K P S L R E K L Q A G M E A L A L
721/241 751/251

tec egg aag etc ccc cag gcg cac acc gac ccg ccc ctg gag gcg gac etc ggg agg cgc 
S R K L S Q V H T D L P L E V D F G R R
781/261 811/271

cgc aca ccc aac ccg gag ggc ccg egg get ccc ctg gag egg ccg gag ccc gga age ccc 
R T P N L E G L R A F L E R L E F G S L
841/281 871/291 

etc cac gag ttc ggc etc ccg gag ggg ccg aag gcg gca gag gag gcc ccc egg ccc ccc 
L H E F G L L E G P K A A E E A P W P P
901/301 931/311

ccg gaa ggg gcc cct ccg ggc ccc ccc etc ccc cgt ccc gag ccc acg egg gcc gag ctt 
P EGAFLG FSFS RPEPMWAEL 
961/321 991/331

ctg gcc ctg get ggg gcg tgg gag ggg cgc ccc cac egg gca caa gac ccc ccc agg ggc 
L A L A G A W E G R L H R A Q D P L R G
1021/341 1051/351

ccg agg gac ctt aag ggg gtg egg gga acc ctg gcc aag gac ccg gcg gcc ctg gcc ccg 
L R D L K G V R G I L A K D L A V L A L
1081/361 1111/371 

egg gag ggc ccg gac ccc CCc cca gag gac gac ccc atg ccc ctg gcc cac cct ccg gac 
R E G L D L F P E D D P M L L A Y L L D
1141/381 1171/391 

ccc ccc aac acc acc ccc gag ggg gcg gcc cyg cgc cac ggg ggg gag egg acg gag gac 
P S N T T P E G V A R R Y G G E W T E D
1201/401 1231/411 

gcg ggg gag agg gcc ccc ctg gcc gag cgc ccc ttc cag acc eta aag gag cgc ccc aag 
A G E R A L L A E R L F Q T L K E R L K
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1261/421 1291/431 

gg« gaa gaa cgc ccg etc tgg etc tac gag gag gtg gag aag ccg etc tec egg gtg ttg 
G E E R L L W L Y E E V E K P L S R V L
1321/441 1351/451 

gcc egg acg gag gcc acg ggg gtc egg cug gac gcg gpc cac ccc cag gcc etc ccc ccg 
A R M E A T G V R L D V A Y L Q A L S L
1381/461 1411/471

gag gcg gag gcg gag gtg cqc cag cCg gag gag gag gtc ttc cgc ccg gcc ggc eac ccc 
E V E A E V R Q L E E E V F R L A G H P
1441/481 1471/491 

etc aac etc aac ccc cgc gac cag ctg gag egg gtg etc etc gac gag ctg ggc ccg cet 
FN LNSRDQL ERVL FDELG L P 

1501/501 1531/511 

gcc acc ggc aag acg gag aag acg ggg aaa cgc tec acc age get gcc gcg ctg gag gcc 
AIGKTEKT G KRSTSAAVL EA 
1561/521 1591/531 

ctg cga gag gcc eac ecc ace gtg gac cgc ate ctg cag tac egg gag etc acc aag etc 
LREAHPIVDRI LQYR ELTKL 
1621/541 1651/551 

aag aac acc tac ata gac ccc ctg ccc gcc ctg gcc cac ccc aag acc ggc egg etc cac 
K NTY I D P LP A L VH P KTGR L H 
1681/561 1711/571

acc cgc etc aac cag acg gcc acc gcc aog ggc agg etc tec age tec gac ccc aac ctg 
TRFNQTATATGR LSSSDPNL 
1741/581 1771/591 

cag aac ate ecc gcg cgc acc ect ctg ggc cag cgc ate cgc cga gcc ttc gtg gcc gag 
Q N I P V R T P L G Q R I R R A F V A E
1801/601 1831/611

gag ggc Cgg gcg ctg gtg gcc ccg gac tac age cag ate gag etc egg gtc ccg gcc cac 
EGWVLVVLDYSQIELRVLAH 
1861/621 1891/631

ccc tec ggg gac gag aac ccg ate egg gcc ccc cag gag ggg agg gac acc cac acc cag 
L S G D E N L I R V F Q E G R D I H T Q
1921/641 1951/651

acc gcc age Cgg atg etc ggc gCC tec ccc gaa ggg gta gac ect ctg acg cgc egg gcg 
TASWMFGVS PEGVDPLMRRA 
1981/661 2011/671 

gcc aag ace ate aac ttc ggg gtg ccc tac ggc acg ccc gcc cac cgc ccc tec ggg gag 
A K T I N F G V L Y G M S A H R L S G E
2041/681 2071/691 

etc Ccc acc ccc cac gag gag gcg gcg gcc ccc acc gag cgc cac ttc cag age tac ccc 
LS IPYEEAVAFIERYFQSY P 
2101/701 2131/711 

aag gtg egg gcc tgg ate gag ggg acc ccc gag gag ggc cgc cgg egg ggg tat gtg gag 
K V R A W I E G T L E E G R R R G Y V E
2161/721 2191/731 

acc ecc ttc ggc cgc cgg cgc tac gcg ccc gac ccc aac gcc cgg gcg aag age gtg cgc 
T L F G R R R Y V P D L N A R V K S V R
2221/741 2251/751 

gag gcg gcg gag cgc atg gcc tee aac atg ccg gcc cag ggc acc gee gcc gac etc acg 
E A A E R M A F N M P V Q G T A A D L M

2281/761 2311/771 

aag ctg gee atg gtg cgg etc ccc ecc cgg etc cag gaa ccg ggg gcg agg acg etc ccg 
K L A M V R L F P R L Q E L G A R M L L
2341/781 2371/791 

cag gcg cac gac gag ctg gtc etc gag gcc ccc aag gac cgg gcg gag agy gca gcc get 
Q V H D E L V L E A P K D R A E R V A A
2401/801 2431/811 

ttg gcc aag gag gtc atg gag ggg gtc cgg ccc ctg cag gcg ccc ctg gag gtg aag gcg 
L A K E V M E G V W P L Q V P L E V E V
2461/821 2491/831
ggc ccg ggg gag gac cgg etc ccc gcc aag gag cag 
G L G E D W L S A K E *
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FIG. 4 
(sheet 1)

DNA sequence 2505 b.p. ATGGAGGCGATG ... GCCAAGGGTTAG linear

coding sequence of T. thermophilus DNA polymerase as submitted by D. Gelfand in WO 91/09950 PCTV/XJSvO /Ul

1/1 31/11

\TG GAG GCG ATG CTT CCG CTC TIT GAA CCC AAA GGC COG GTC CTC CTC GTC GAC GGC CAC 
M E A M L P L F E P K G R V L L V D G H
61/21 91/31 

ZAC CTG GCC TAC CGC ACC TTC TTC CCC CTG AAG GGC. CTC ACC ACG AGC CGG. OGC OAA CCC . 
H L A Y R T F F A L K G L T T S R G E P
121/41 151/51 

arc CAG OCX; crc tag o3c tic goc aag aoc ctc arc aag gcc ctg aag gag gac cgg tac 

V Q A V Y G F A K S L L K A L K E D G Y
181/61 211/71

AAG GCC GTC TTC CTG GTC TTT GAC OCC AAG ' GCC CCC TCC TIC CGC CAC GAG GCC TAC GAG 
K A V F V V F D A K A P S F R H E A Y E
241/81 271/91 

GCC TAC AAG GCG CGG AOG GCC CCG ACC CCC GAG GAC TfC CCC CGG CAG CTC GCC CIC ATI 
A Y K A G R A P T P E D F P R Q L A L I
301/101 331/111 

AAG GAG CTC CTC CAC CTC CTG GCC TTT AOC COC CTC GAC GTC CCC GOC TAC GAG CCC GAC 

K E L V D L L G F T R L E V P G Y E A D

361/121 391/131 

GAC CTT CTC COC ACC CTG OCC AAC AAG OOC GAA AAG GAG GCC TAC CAC GTC COC ATC CTC 
D V L A T L A K K A E K E G Y E V R I L
421/141 451/151

ACC GCC GAC CGC GAC CTC TAC CAA CTC GTC TCC GAC CCC CTC CCC GTC CTC CAC CCC CAC 
T A D R D L Y Q L V S D R V A V L H P E
481/161 511/171 

GOC CAC CTC ATC ACC CCC GAG TCC CTT TOG CAC AAC TAC GOC CTC ACC CCC GAC CAG TGC . 
G H L I T P E W L W E K Y G L R P E Q W

541/181 571/191

CTG GAC TTC CGC GCC CTC GTC GCG CAC CCC TCC GAC AAC CPC CCC CGG GTC AAG CCC ATC 
V D F R A L V G D P S D N L P G V K G I
601/201 631/211 

OGG GAG AAC ACC CCC CTC AAG CTC CTC AAC GAG TOG OCA AOC C1T3 GAA AAC CTC CTC AAG 
G E K T A L K L L K E W G S L E N L L K
661/221 691/231 

AAC CTG GAC CGG GTA AAG CCA GAA AAC GTC CGG GAG AAG AlC AAG GOC C^C CTC GAA CAC . 
N L D R V K P E N V R E K I K A H L E D
721/241 751/251 

CTC AOG CTC TCC TTG GAG CTC TCC CGG CTG CGC ACC GAC CPC CCC CTG GAG CTG CAC CTC 
L R L S L E L S R V R T D L P L E V D L
781/261 811/271 

GCC CAG GGC CGC GAG CCC GAC CCC GAC GOG CTT AGC GCC rPC CIC GAC AGC CPC CAG TTC 
A Q G R E P D R E G L R A F L E R L E F
841/281 871/291

GGC AGC CTC CTC CAC GAG TIC GGC CTC CTG GAG GCC CCC GCC CCC CIG GAG GAG GCC CCC 
G S L L H E F G L L E A P A P L E E A P
901/301 931/311 
' TCC CCC CCG CCG GAA OGG CCC TTC CTC COC TTC GTC CTC TCC (rCC CCC CAC CCC ATC TCG 
W P P P E G A F V G F V L S R P E P M W
961/321 991/331 

GCG GAG CTT AAA GCC CTG GCC CCC TGC AGG GAC GGC CCC GTC CJ-.C COC GCA CCA GAC CCC 
A E L K A L A A C R D G R V H R A A D P
1021/341 1051/351

TTC OCG GGC CTA AAG GAC CTC AAG CAG GTC OGG GGC CTC CIC CCC AAC CAC CTC GCC GTC 
L A G L K D L K E V R G L L A K D L A V
1081/361 1111/371

TTC GCC TCC AGG GAG GGC CTA CAC CTC GTC CCC GGC GAC CAC CCC A lC CPC CTC CCC TAC 
L A S R E G L D L V P G D D P M L L A Y

1141/381 1171/391

CTC CTT7 GAC CCC TCC AAC ACC ACC CCC CAG GCG CTG OCC CCJ C'::C VM. • V wO; CAG 'ICC 
L L D P S N T T P E G V A R R Y G G E W
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1201/401 1231/411 

ACG GAG GAC GCC GCC CAC CGG GCC CTC CT:: ICG GAG ACG CIC CAT CGC AAC CTC CTf AAG 
T E D A A H R A L L S E R L H R N L L K

1261/421 1291/431

CGC CTC GAG GGG GAG GAG AAG CTC CTT TGG CPC TAC CAC GAG GTG GAA AAG CCC CTC TCC 
R L E G E E K L L W L Y H E V E K P L S

1321/441 1351/451

CGG GTC CTG GCC CAC ATC GAG GCC ACC GOG GTA CGG CTC GAC GTG CCC TAC CTT CAG GCC 
R V L A H M E A T G V R L D V A Y L Q A

1381/461 1411/471 

CTT ICC CTC GAG CTT GCG GAG GAG ATC CGC CGC CTC GAG GAG GAG GTC TIC CGC TTG GCG . 
L S L E L A E E I R R L E E E V F R L A

1441/481 1471/491 

GOC CAC CdC TTC AAC CTC AAC TCC CGG GAC CAG CTG GAA AGG GTG CTC TIT GAC GAG CTT 
G H P F N L N S R D Q L E R V L F D E L
1501/501 1531/511

AGG CIT CCC GCC TTC GGG AAG ACG CAA AAG ACA GCX: AAG CGC TCC ACC AGC GCC GCG GTG 
RLPALGKTQKTGKRSTSAA V 

1561/521 1591/531

CTG GAG GCC CTA CGG GAG GCC CAC CCC ATC GTG GAG AAG ATC CTC CAG CAC CGG GAG CTC 
L E A L R E A H P I V E K I L Q H R E L
1621/541 1651/551

ACC AAG CTC AAG AAC ACC TAC GTG GAC CCC CIC CCA AGC CTC GTC CAC CCC AGG ACG GGC 
T K L K N T Y V D P L P S L V H P R T G

1681/561 1711/571

CGC CIC CAC ACC CGC TTC AAC CAG ACG GCC ACG GCC ACG GOG AGG CTT ACT AGC TCC CAC 
R L H T R F N Q T A T A T G R L S S S D
1741/581 1771/591 

CCC AAC CTG CAG AAC ATC CCC GHC CGC ACC CCC TIC GGC CAG AGG A'lC CGC CGG GCC TCC 
P N L Q N I P V R T P L G Q R I R R A F
1801/601 1831/611

GTG GCC GAG GCG GCr TGG GCC TTG GTG GCC CFG GAC TAT ACC CAG ATA GAG CTC CGC GTC 
VAEAGWALVALD YSQIELRV 

1861/621 1891/631

CTC OCC CAC CIC TCC OGG GAC GAA AAC CIC AlC AGG GTC ITC CAG GAG GOG AAG GAC AlC 
L A H L S G D E N L I R V F Q E G K D I
1921/641 1951/651

CAC ACC CAG ACC GCA AGC TGG ATG TTC GOC CTC CCC CCG CAC CCC GTC GAc CCC CTC ATC. 
H T Q T A S W M F G V P P E A V D P L M
1981/661 2011/671

CGC CGG GCG GCC AAG ACG GTC AAC TTC GGC GTC CIC TAC GGC ATG TCC GCC CAT AGG CPC 
R R A A K T V N F G V L Y G M S A H R L
2041/681 2071/691 

TCC CAG GAG CTT GCC ATC CCC TAC GAG GAG GCG GTG OCC TIT ATA GAG CGC TAC TIC C\A 
S Q E L A I P Y E E A V A F I E R Y F Q
2101/701 2131/711

AGC TIC CCC AAG GTG CGC GCC TGG ATA GAA AAG ACC CTG GAG GAG GGG AGG AAC CCG CCC 
S F P K V R A W I E K T L E E G R K R G
2161/721 2191/731 

TAC Crc GAA ACC CTC TTC GCA AGA AGG CGC TAC GTC CCC GAC CTC A/vC GCC CGC GTG AAO 
Y V E T L F G R R R Y V P D L N A R V K
2221/741 2251/751

AGC CTC AGG GAC GCC GCC CAC CGC ATG CCC ITC AAC ATC CCC CIC CAG C^JC ACC GCC ay: 
S V R E A A E R M A F N M P V Q G T A A
2281/761 2311/771

GAC CIC ATG AAG CTC GCC ATC GTC AAG CTC TIC CCC CGC CTC CGG GAG ATG CGC CCC CCC 
D L M K L A MV K L F P R L R EM G A R 
2341/781 2371/791 

ATC CTC CIC CAG GTC CAC GAC GAG CTC CTC CTC GAC OCC CCC CAA GCG 030 CCC GAG GAG 
M L L Q V H D E L L L E A P Q A R A E E
2401/801 2431/811

GTC GCG GCT TIC GCC AAG GAG CCC ATC CAG AAC CCC TAT CCC CTC CCC CIC CCC CTC CAG 
V A A L A K E A M E K A Y P L A V P L E
2461/821 2491/831

GTC GAG GTG GGG ATG GGG CAC GAC TCC CTT TCC CCC A/\C CCl' T AC 
V E V G M G E D W L S A K G
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FIG. 5 

(Sheet 1) 

DNA and protein sequence of the coding region of pMR8, encoding FY4
1/1 31/11

ATG CTG GAA CGT CTG GAA TTC GGC AGC CTC CTC CAC GAG TTC GGC CTC CTG GAG GCC CCC 
M L E R L E F G S L L H E F G L L E A P

61/21 91/31 

GCC CCC CTG GAG GAG GCC CCC TGG CCC CCG CCG GAA GGG GCC TTC GTG GGC TTC GTC CTC 
A PLEEAPWPPPEGAFVGFVL 
121/41 151/51 

TCC CGC CCC GAG CCC ATG TGG GCG GAG CTT AAA GCC CTG GCC GCC TGC AGG GAC GGC CGG 
S R P E P M W A E L K A L A A C R D G R
181/61 211/71

GTG CAC CGG GCA GCA GAC CCC TTG GCG GGG CTA AAG GAC CTC AAG GAG GTC CGG GGC CTC
V H R A A D P L A G L K D L K E V R G L
241/81 271/91

CTC GCC AAG GAC CTC GCC GTC TTG GCC TCG AGG GAG GGG CTA GAC CTC GTG CCC GGG GAC
L A K D L A V L A S R E G L D L V P G D
301/101 331/111 

GAC CCC ATG CTC CTC GCC TAC CTC CTG GAC CCC TCC AAC ACC ACC CCC GAG GGG GTG GCG 
D P M L L A Y L L D P S N T T P E G V A
361/121 391/131 

CGG CGC TAC GGG GGG GAG TGG ACG GAG GAC GCC GCC CAC CGG GCC CTC CTC TCG GAG AGG 
R R Y G G E W T E D A A H R A L L S E R
421/141 451/151 

CTC CAT CGG AAC CTC CTT AAG CGC CTC GAG GGG GAG GAG AAG CTC CTT TGG CTC TAC CAC 
L H R N L L K R L E G E E K L L W L Y H
481/161 511/171

GAG GTG GAA AAG CCC CTC TCC CGG GTC CTG GCC CAC ATG GAG GCC ACC GGG GTA CGG CTG 
E V E K P L S R V L A H M E A T G V R L
541/181 571/191

GAC GTG GCC TAC CTT CAG GCC CTT TCC CTG GAG CTT GCG GAG GAG ATC CGC CGC CTC GAG 
DVAYLQALSLELAEEIRRLE 
601/201 631/211 

GAG GAG GTC TTC CGC TTG GCG GGC CAC CCC TTC AAC CTC AAC TCC CGG GAC CAG CTG GAA 
E E V F R L A G H P F N L N S R D Q L E
661/221 691/231 

AGG GTG CTC TTT GAC GAG CTT AGG CTT CCC GCC TTG GGG AAG ACG CAA AAG ACA GGC AAG 
R V L F D E L R L P A L G K T Q K T G K
721/241 751/251 

CGC TCC ACC AGC GCC GCG GTG CTG GAG GCC CTA CGG GAG GCC CAC CCC ATC GTG GAG AAG 
RSTSAAVLEALREAHPIVEK 
781/261 811/271

ATC CTC CAG CAC CGG GAG CTC ACC AAG CTC AAG AAC ACC TAC GTG GAC CCC CTC CCA AGC 
I L Q H R E L T K L K N T Y V D P L P S
841/281 871/291

CTC GTC CAC CCG AGG ACG GGC CGC CTC CAC ACC CGC TTC AAC CAG ACG GCC ACG GCC ACG 
L V H P R T G R L H T R F N Q T A T A T
901/301 931/311 

GGG AGG CTT AGT AGC TCC GAC CCC AAC CTG CAG AAC ATC CCC GTC CGC ACC CCC TTG GGC 
GRLSSSDPNLQNI PVRTPLG 
961/321 991/331 

CAG AGG ATC CGC CGG GCC TTC GTG GGC GAG GCG GGT TGG GCG TTG GTG GCC CTG GAC TAT 
Q R I R R A F V A E A G W A L V A L D Y
1021/341 1051/351 
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FIG. 5

(Sheet 2) 

AGC CAG ATA GAG CTC CGC GTC CTC GCC CAC CTC TCC GGG GAC GAA AAC CTG ATC AGG GTC 
SQIE L RVLAHLSGDENL IRV 
1081/361 1111/371

TTC CAG GAG GGG AAG GAC ATC CAC ACC CAG ACC GCA AGC TGG ATG TTC GGC GTC CCC CCG 
F Q E G K D I H T Q T A S W M F G V P P
1141/381 1171/391

GAG GCC GTG GAC CCC CTG ATG GGC CGG GCG GCC AAG ACG GTG AAC TAC GGC GTC CTC TAC
E A V D P L M R R A A K T V N Y G V L Y
1201/401 1231/411 

GGC ATG TCC GCC CAT AGG CTC TCC CAG GAG CTA GCC ATC CCC TAC GAA GAA GCG GTG GCC 
G M S A H R L S Q E L A I P Y E E A V A
1261/421 1291/431

TTT ATA GAG CGC TAC TTC CAA AGC TTC CCC AAG GTG CGG GCC TGG ATA GAA AAG ACC CTG 
F I E R Y F Q S F P K V R A W I E K T L

1321/441 1351/451

GAG GAG GGG AGG AAG CGG GGC TAC GTG GAA ACC CTC TTC GGA AGA AGG CGC TAC GTG CCC 
E E G R K R G Y V E T L F G R R R Y V P
1381/461 1411/471 

GAC CTC AAC GCC CGG GTG AAG AGC GTC AGG GAG GCC GCG GAG CGC ATG GCC TTC AAC ATG
D L N A R V K S V R E A A E R M A F N M
1441/481 1471/491

CCC GTC CAG GGC ACC GCC GCC GAC CTC ATG AAG CTC GCC ATG GTG AAG CTC TTC CCC CGC 
P V Q G T A A D L M K L A M V K L F P R
1501/501 1531/511 

CTC CGG GAG ATG GGG GCC CGC ATG CTC CTC CAG GTC CAC GAC GAG CTC CTC CTG GAG GCC
L R E M G A R M L L Q V H D E L L L E A
1561/521 1591/531 

CCC CAA GCG CGG GCC GAG GAG GTG GCG GCT TTG GCC AAG GAG GCC ATG GAG AAG GCC TAT 
PQARA EEVAAL A KEAMEKAY 
1621/541 1651/551 

CCC CTC GCC GTG CCC CTG GAG GTG GAG GTG GGG ATG GGG GAG GAC TGG CTT TCC GCC AAG 
P L A V P L E V E V G M G E D W L S A K
1681/561 
GGT TAG 
G *
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